Abstract : We present a new structural (or syntactic) approach for estimating the satisfiability threshold
of random 3-SAT formulae. We show its efficiency in obtaining a jump from the previous upper bounds, 
lowering them to 4.506. The method combines well with other techniques, and also applies to other problems, 
such as the 3-colourability of random graphs. 
The last decade has seen a growth of interest in phase transition phenomena in hard
combinatorial decision problems, due to resulting insights into their computational complexity and
that of the associated optimization problems. There is a fast growing body of theoretical
investigations as well as ones exploring algorithmic solver implications. Latterly, moreover,
statistical physics studies have also shed new light on these phenomena, whence a further
surge in interest. Among the various and extensive contributions, let us single out a few:
[1], [37], [30], [13], [39]. Several surveys can be found in [16].
One of the most challenging phase transitions, with a long history of results, concerns the
problem of 3-Satisfiability (to satisfy sets of clauses of length 3, i.e. disjunctions of 3 literals).
[26] contains a survey which we briefly summarize and update here. Experiments strongly
suggest that satisfiability of random 3-SAT formulae (the 3-SAT problem) exhibits a sharp
threshold or a phase transition as a function of a control parameter, the ratio $c$ of the number
of clauses to the number of variables. More precisely, this would mean the existence of a
critical value $c_0$ such that for any $c < c_0$ the probability of satisfiability of a random 3-SAT
formula tends to 1 as $n \to \infty$, and for $c > c_0$ it tends to 0. Over the years, two series of
bounds for $c_0$ have been established, the lower bounds being: 2.9 (positive probability only),
2/3, 1.63, 3.003, 3.145, 3.26, 3.42 (see, among others, [29]), and the upper bounds: 5.191,
5.081, 4.762, 4.643, 4.602, 4.596, 4.571, 4.506 (see, among others, [35], [11], [30], [24], [25]). The
last bound, 4.506, was briefly presented in [12]. The present paper gives a detailed proof,
emphasizing the potential of the main innovation, which we called the structural or syntactic
approach, in contrast to the semantic approach hitherto used to establish upper bounds. A
few general comments are in order. Thanks to this structural approach, a jump from 4.643
to 4.506 was obtained. Developments since then have confirmed the interest and versatility
of this technique. Further refinements of the semantic approach, together with subtle
and sophisticated probabilistic and analytical results, have failed to match the 4.506 bound,
giving 4.571 as announced in [25]. And we recently applied our structural approach to the
equally challenging 3-colouring problem. It turned out to combine well with the decimation
technique we had used for the 3-XORSAT problem, lowering the best upper bound from
2.4945 to 2.427.

*A preliminary short version of this paper appeared in the Proceedings of the Eleventh ACM-SIAM
Symposium on Discrete Algorithms, pages 124-126, San Francisco, California, January 2000.

In the remainder of this section, we present the probabilistic model for 3-SAT we work with,
then give an overview of our approach leading to the bound of 4.506. The subsequent sections 
contain the detailed calculations. 



1.1 Probabilistic model. 

Let $V_n = \{x_1, \ldots, x_n\}$ be a set of $n$ boolean variables, $L_n = \{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$
the corresponding set of positively and negatively signed literals. In this paper we use the
ordered-clauses model. Here an $n$-formula $F$ is simply a map to $L_n$ from the formula template
$\Lambda_{c,n}$, an array of $cn$ clause templates consisting each of 3 ordered places or cells. If the
literal $l$ is the image under $F$ of cell $\xi$, we also say that it fills $\xi$. The set $\Omega(n,c)$ of
$n$-formulae is made into a probability space by assigning each formula the probability
$1/|\Omega(n,c)| = (2n)^{-3cn}$.

Each truth assignment $A : V_n \to \{0,1\}$ is conventionally extended to $L_n$ so that
$A(\bar{x}_j) = 1 - A(x_j)$, and is said to satisfy the clause $C_k$ if $A(l) = 1$ for some $l \in C_k$, and
the formula $F$ if it satisfies all its clauses; in which case $A$ is a solution of $F$, and $F$ is
satisfiable. The probability of satisfiability of a random formula $F$ of $\Omega(n,c)$ is denoted by
$\Pr_{n,c}(SAT)$.

A few words in comparison with the non-ordered-clauses model, also very usual. Here a
clause is a set of 3 literals with distinct underlying variables, and a random formula is a
sequence of $m = cn$ clauses drawn independently and uniformly among the $2^3\binom{n}{3}$ possible
clauses. Convergence to 0 (resp. 1), as $n \to \infty$, of $\Pr_{n,c}(SAT)$ is readily seen to imply the
same for the probability in the non-ordered-clauses model. Thus our upper bound of 4.506,
once proven in the ordered-clauses model, will hold in both.
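
As an illustration, the ordered-clauses model can be sketched in a few lines of Python. This is only a minimal sketch under assumptions made here, not part of the original development: literals are encoded as nonzero integers (+j for $x_j$, -j for its negation), and the names `random_formula` and `satisfies` are hypothetical.

```python
import random

def random_formula(n, c, rng):
    """Ordered-clauses model (sketch): each of the 3*c*n cells is filled
    independently with a literal drawn uniformly among the 2n literals.
    Assumed encoding: +j stands for x_j, -j for its negation."""
    m = int(c * n)  # number of clause templates
    return [tuple(rng.choice((1, -1)) * rng.randint(1, n) for _ in range(3))
            for _ in range(m)]

def satisfies(assignment, formula):
    """assignment maps each variable index j to 0 or 1."""
    def value(lit):
        v = assignment[abs(lit)]
        return v if lit > 0 else 1 - v  # A(not x_j) = 1 - A(x_j)
    return all(any(value(l) == 1 for l in clause) for clause in formula)
```

Note that, as in the model itself, a clause produced this way may repeat a variable or contain a literal together with its negation.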



1.2 Outline. 

We give first a general idea of our approach stemming from concrete experiments. A 
computer-based generator of random formulae churns out mechanically, as the case may 
be, only satisfiable, or only contradictory formulae. To say that certain formulae are never 
produced (within a realistic timeframe) simply means that they form a set of vanishingly 
small probability; and, due to the very dumbness of the generator, the distinction between 
'likely' and 'unlikely' formulae must be possible on a very basic level, considering only their 
form or structure. Ideally, we would like an exact criterion for 'likely' or 'typical' formulae;
possibly, then, the first moment method, restricted not to particular types of solutions,
but to formulae with this particular property, might give us the exact value of $c_0$. Such an
exact characterization is elusive, though, and unlikely to emerge in a simple, usable form. 
Rather, in this paper we show the usefulness of an uncomplicated partial characterization 
in terms of the numbers of occurrences and signs of the variables. The pure effect on the 
expectation of restricting the formulae becomes only part of the story. Equally important is 
the fact that, far from interfering with other approaches, the added structure actually helps
in otherwise difficult or hopeless enumerations. Thus we do not need, e.g., sophisticated 
probability results. And, particularly, we are able to introduce at virtually no cost some 
structural manipulations on the balancing of the signs of occurrences per variable which 
would be impractical in the purely semantic approach. On the other hand, to attain full 
rigour the method does require fairly lengthy calculations, notably to bound errors arising 
from the finite size of formulae, and thoroughly to justify the optimization procedures. These 
remain relatively elementary, though, and, in the case of the error estimates, fairly routine. 

Practically we first characterize the asymptotic distribution of the signed occurrences per 
variable, namely : 



Lemma 1.1 For any integers $0 \le p \le x$, define $\kappa_{x,p} = 2^{-x}\binom{x}{p}\,p(x,\lambda)$, where $\lambda = 3c$
and $p(x,\lambda)$ is the Poisson probability mass function of mean $\lambda$, i.e. $p(x,\lambda) = e^{-\lambda}\lambda^x/x!$.
Let the random variable $\omega_{x,p}$ be the proportion of variables of a random formula having
$x$ occurrences, among which exactly $p$ have a positive signature. Then for any $\varepsilon > 0$,
$\lim_{n\to\infty} \Pr(|\omega_{x,p} - \kappa_{x,p}| > \varepsilon) = 0$.
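
The limiting proportions of Lemma 1.1 are easy to tabulate. The following sketch (function names are assumptions of this example) computes $\kappa_{x,p}$ for $c = 4.506$ and checks numerically, on a truncated range of $x$, the two identities $\sum \kappa_{x,p} = 1$ and $\sum x\,\kappa_{x,p} = \lambda$ that are used later in the text.

```python
from math import comb, exp, factorial

def poisson_pmf(x, lam):
    """Poisson probability mass function of mean lam."""
    return exp(-lam) * lam ** x / factorial(x)

def kappa(x, p, lam):
    """kappa_{x,p} = 2^{-x} * C(x, p) * p(x, lam)."""
    return 2.0 ** (-x) * comb(x, p) * poisson_pmf(x, lam)

lam = 3 * 4.506  # lambda = 3c
X_CUT = 120      # truncation point (assumption): the Poisson tail beyond it is negligible
total = sum(kappa(x, p, lam) for x in range(X_CUT) for p in range(x + 1))
mean = sum(x * kappa(x, p, lam) for x in range(X_CUT) for p in range(x + 1))
```

Up to truncation and rounding, `total` is 1 and `mean` equals `lam`; the symmetry $\kappa_{x,p} = \kappa_{x,x-p}$ is immediate from the binomial coefficient.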



It can be seen easily that Lemma 1.1 implies that an upper bound on the satisfiability
threshold is obtained by calculating the expected number of solutions of a typical formula
(in a sense to be specified shortly, but roughly meaning that for most $(x,p)$, there are nearly
$\kappa_{x,p}\,n$ variables having $x$ total, $p$ positive, occurrences). Typical formulae, however, also
provide us a strong means to go further in the structural manipulation of formulae. But
we need first to recall the definition of particular solutions which in [11] we called PPSs
(for Positively Prime Solution, symmetrically there are NPSs). Note that these restrictive
solutions have been introduced independently by Kirousis et al. in [30] under the terminology
of locally maximal solutions and single-flip technique.



Definition 1.1 A Positively Prime Solution (PPS) $A$ of a SAT formula $F$ is a solution of
$F$ such that no variable of $F$ with the value 1 under $A$ can be singly inverted (or switched)
to 0 unless at least one of the clauses of $F$ becomes unsatisfied, that is, the new assignment
is no longer a solution of $F$.
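
Definition 1.1 translates directly into a brute-force test. The sketch below is illustrative only (names and the integer encoding of literals, +j for $x_j$ and -j for its negation, are assumptions of this example): it checks that the assignment is a solution and that flipping any single variable from 1 to 0 breaks at least one clause.

```python
def value(assignment, lit):
    v = assignment[abs(lit)]
    return v if lit > 0 else 1 - v

def satisfies(assignment, formula):
    return all(any(value(assignment, l) == 1 for l in cl) for cl in formula)

def is_pps(assignment, formula):
    """True iff `assignment` is a Positively Prime Solution of `formula`."""
    if not satisfies(assignment, formula):
        return False
    for var, val in assignment.items():
        if val == 1:
            flipped = dict(assignment)
            flipped[var] = 0  # single flip 1 -> 0
            if satisfies(flipped, formula):
                return False  # the flip kept a solution, so not prime
    return True
```

For the single clause $(x_1 \vee x_2 \vee x_3)$, the assignment $x_1 = 1$, $x_2 = x_3 = 0$ is a PPS, whereas $x_1 = x_2 = 1$, $x_3 = 0$ is a solution but not a PPS.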



Any satisfiable formula has a PPS, but some have very many: PPSs provide an extremely useful,
yet somewhat limited, restriction. A means to enhance this is unbalancing, which we now
introduce on an intuitive level.

When enumerating formulae with a view to computing an expectation, we usually count 
as different some formulae (very many, in fact) which really are the same from the point 
of view of satisfiability. This happens in more than one way. Some formulae differ from 
each other by a permutation on the set of clauses, or on the set of variables; these, however, 
are fairly transparent. What concerns us here are formulae deduced from one another by 
renaming certain variables, in the restricted (and usual) sense of inverting the signs of 
all their occurrences. Their significance to us stems from the fact that unlike those just 
mentioned, they are not neutral with respect to PPSs. Consider, e.g., a pure variable (one
which has all of its occurrences of the same sign). This sign is indifferent as far as ordinary 
solutions are concerned, but a solution in which a negated pure variable takes the value 1 
cannot be a PPS, while an unnegated pure variable has the best chances that many of the 
solutions giving it the value 1 will be PPSs. Similarly, a variable with more positive than 
negative instances is likely to kill fewer PPSs than the reverse. Therefore, of two formulae 
which differ only by the systematic inversion of some variables, the one with more negatively 
unbalanced variables may be assumed to have fewer PPSs. To be precise, call two formulae 
equivalent if one can be obtained from the other by renaming certain variables. Clearly, 
this is indeed an equivalence relation $\mathcal{R}$ on the set $\Omega(n,c)$ of 3-SAT formulae on $n$ variables
with $cn$ clauses; $\mathcal{R}$ results in the partitioning of formulae with respect to equivalence modulo
variable renaming, and the cardinality of the equivalence class of a formula $F$ is $2^{\nu_u(F)}$,
where $\nu_u(F)$ is the number of unbalanced variables in $F$ (variables having unequal numbers
of positive and negative occurrences; note that an absent variable is, by definition, balanced).

Since negatively unbalanced variables tend to inhibit PPSs, we have a good candidate for the
formula with the fewest PPSs within each equivalence class $C$: namely, the totally unbalanced
representative $F^-$ obtained from any $F \in C$ by renaming exactly those variables which have
more positive than negative occurrences. Moreover we have an easy criterion for a formula
to be the totally unbalanced representative of a typical formula, namely that the proportion
of variables having $x$ total, $p$ positive occurrences be $2\kappa_{x,p}$ if $x > 2p$; $\kappa_{x,p}$ if $x = 2p$; and 0 if
$x < 2p$. So these representatives, or, as we shall say, the typical totally unbalanced formulae
(by abuse of language, since they are actually not typical at all) can be defined just like the
typical formulae, only using instead of the $\kappa_{x,p}$s their totally unbalanced counterparts, the
$\mathring{\kappa}_{x,p}$s defined by:

$$\mathring{\kappa}_{x,p} = \begin{cases} 2\kappa_{x,p} & \text{if } x > 2p \\ \kappa_{x,p} & \text{if } x = 2p \\ 0 & \text{if } x < 2p \end{cases} \qquad (1)$$
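
At the level of a single formula, the renaming that produces the totally unbalanced representative $F^-$ can be sketched as follows. The encoding (+j for $x_j$, -j for its negation) and the function names are assumptions of this example.

```python
from collections import Counter

def signed_occurrences(formula):
    """Count positive and negative occurrences of each variable."""
    pos, neg = Counter(), Counter()
    for clause in formula:
        for lit in clause:
            (pos if lit > 0 else neg)[abs(lit)] += 1
    return pos, neg

def totally_unbalanced(formula):
    """Rename (invert all occurrences of) exactly those variables having
    more positive than negative occurrences."""
    pos, neg = signed_occurrences(formula)
    flip = {v for v in pos if pos[v] > neg[v]}
    return [tuple(-l if abs(l) in flip else l for l in clause)
            for clause in formula]

def num_unbalanced(formula):
    """nu_u(F): variables with unequal positive and negative counts."""
    pos, neg = signed_occurrences(formula)
    return sum(1 for v in set(pos) | set(neg) if pos[v] != neg[v])
```

After the renaming, every variable has at most as many positive as negative occurrences, which is the situation described by the case $x < 2p$ of (1) having proportion 0.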

All equivalence classes of such representatives have the same number of elements, namely:

$$2^{\,2n\sum_{0 \le 2p < x} \kappa_{x,p}}.$$

Calculations with typical totally unbalanced formulae are no harder than with plain and
ordinary typical formulae, in fact they are much the same with $\mathring{\kappa}$ replacing $\kappa$, and the
specifics of the distribution tend to intervene only in the very last stages. Computing (via a
simple technical device) the expected number of PPSs of the former rather than the latter,
then multiplying by the above size of equivalence classes, we get what amounts to a 'skewed'
expectation where each formula is counted, not according to its own number of PPSs, but
to that of its representative with fewest PPSs. It is this, combined with the gain already
inherent in the restriction to structured formulae per se, that affords us a very significant
improvement on the upper bound of 4.643 resulting from the expectation of PPSs alone.


Before proceeding, we have to take account of some practical remarks raised by the foregoing
considerations. (i) The $\kappa_{x,p}$s or the $\mathring{\kappa}_{x,p}$s constitute an infinite family and are all $\ne 0$, while
a formula has finite length; (ii) The proportions $\omega_{x,p}$ of variables having $x$ total, $p$ positive
occurrences in a formula $F \in \Omega(n,c)$ must verify $\sum \omega_{x,p} = 1$ and $\sum x\,\omega_{x,p} = 3c = \lambda$, where
the sums are in effect finite, while the equalities $\sum \kappa_{x,p} = 1$ and $\sum x\,\kappa_{x,p} = \lambda$ only apply with
infinite sums (series); (iii) The $\kappa_{x,p}$s are irrational, so they cannot be exact proportions even
for special values of $n$. Thus, in order to derive a rigorous argument, we define what we call
formulae obeying a given distribution of signed occurrences to a specified approximation:



Definition 1.2 Let $\Xi = (\xi_{x,p})_{0 \le p \le x}$ be a family of nonnegative real numbers satisfying the
relations $\sum_{x=0}^{\infty}\sum_{p=0}^{x} \xi_{x,p} = 1$ and $\sum_{x=0}^{\infty}\sum_{p=0}^{x} x\,\xi_{x,p} = \lambda$. Given a real $\varepsilon > 0$ and an integer
$x_{\max}$, a formula $F \in \Omega(n,c)$ is said to obey the distribution $\Xi$ to the accuracy $(\varepsilon, x_{\max})$ iff
for $0 \le p \le x \le x_{\max}$, the number of variables having $x$ occurrences in $F$, $p$ of which are
positive, lies between $(\xi_{x,p} - \varepsilon)n$ and $(\xi_{x,p} + \varepsilon)n$. The set of formulae in $\Omega(n,c)$ obeying $\Xi$ to
the accuracy $(\varepsilon, x_{\max})$ will be denoted by $\mathcal{F}(\Xi, \varepsilon, x_{\max}, n, c)$.
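
Definition 1.2 can be checked mechanically on a concrete formula. In the sketch below, the distribution is passed as a function `xi(x, p)`; this interface, the function names, and the integer encoding of literals are assumptions made for illustration.

```python
from collections import Counter

def occurrence_profile(formula, n):
    """Map (x, p) to the number of variables among x_1..x_n having
    x occurrences in total, p of which are positive."""
    pos, neg = Counter(), Counter()
    for clause in formula:
        for lit in clause:
            (pos if lit > 0 else neg)[abs(lit)] += 1
    profile = Counter()
    for v in range(1, n + 1):
        profile[(pos[v] + neg[v], pos[v])] += 1
    return profile

def obeys(formula, n, xi, eps, x_max):
    """True iff the formula obeys the distribution xi to the accuracy (eps, x_max)."""
    profile = occurrence_profile(formula, n)
    return all(
        (xi(x, p) - eps) * n <= profile[(x, p)] <= (xi(x, p) + eps) * n
        for x in range(x_max + 1) for p in range(x + 1)
    )
```

Only the pairs with $x \le x_{\max}$ are constrained, exactly as in the definition; variables with more occurrences are left free.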



The term 'typical formula' will sometimes be used loosely to indicate a formula which obeys
the distribution $(\kappa_{x,p})$ to the accuracy $(\varepsilon, x_{\max})$ for some (large) $x_{\max}$ and some (small) $\varepsilon$.

Henceforth the distributions of the $\kappa_{x,p}$'s and of the $\mathring{\kappa}_{x,p}$'s (corresponding, of course, to some
value of $\lambda = 3c$) will be denoted by $\Xi$ and $\Xi_0$, respectively. Also, when the context makes the
various parameters clear, we will often use the abbreviated notation E[PPS] for the expected
number of PPSs of formulae drawn uniformly from $\mathcal{F}(\Xi_0, \varepsilon, x_{\max}, n, c)$. Strictly speaking, a
direct calculation of the expectation of PPSs of typical totally unbalanced formulae would
involve an awkward change of probability space. The same end result can be achieved much
more conveniently by introducing an ad hoc r.v. on the original probability space $\Omega(n,c)$,
then linking its expectation to the probability of satisfiability:



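Proposition 1.2 Define the r.v. $X_{n,\varepsilon,x_{\max},c}$ on $\Omega(n,c)$ by:

$$X_{n,\varepsilon,x_{\max},c}(F) = \begin{cases} 2^{\,2n\sum_{0 \le 2p < x}\kappa_{x,p}} \times PPS(F) & \text{if } F \in \mathcal{F}(\Xi_0, \varepsilon, x_{\max}, n, c) \\ 0 & \text{otherwise} \end{cases}$$

where $PPS(F)$ denotes the number of PPSs of $F$, and set $\rho = \rho_{x_{\max}} = \sum_{2p > x_{\max}} \kappa_{2p,p}$,
$\Lambda = \Lambda_{x_{\max}} = \frac{1}{2}(x_{\max}/2 + 1)$. If, for some integer $x_{\max}$ and some $\varepsilon > 0$,
$2^{(\rho + \varepsilon\Lambda)n}\,E[X_{n,\varepsilon,x_{\max},c}]$ tends to 0 as $n \to \infty$, then so does $\Pr_{n,c}(SAT)$.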
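
To get a feeling for the size of the correction factor $2^{(\rho + \varepsilon\Lambda)n}$, the sketch below evaluates its exponent for $c = 4.506$ with the values $\varepsilon = 10^{-15}$ and $x_{\max} = 56$ fixed in Section 4.2. It assumes that $\rho$ is the tail sum $\sum_{2p > x_{\max}} \kappa_{2p,p}$ and that $\Lambda = \frac{1}{2}(x_{\max}/2 + 1)$; the function names and the truncation point of the tail sum are choices of this example.

```python
from math import comb, exp, factorial

def kappa(x, p, lam):
    """kappa_{x,p} = 2^{-x} * C(x, p) * e^{-lam} * lam^x / x!"""
    return 2.0 ** (-x) * comb(x, p) * exp(-lam) * lam ** x / factorial(x)

def rho(x_max, lam, p_cutoff=80):
    """Tail sum over balanced pairs (2p, p) with 2p > x_max,
    truncated at p_cutoff (assumption: the remainder is negligible)."""
    return sum(kappa(2 * p, p, lam) for p in range(x_max // 2 + 1, p_cutoff + 1))

def big_lambda(x_max):
    """Lambda = (x_max/2 + 1) / 2, for even x_max."""
    return 0.5 * (x_max / 2 + 1)

c, eps, x_max = 4.506, 1e-15, 56
lam = 3 * c
correction_exponent = rho(x_max, lam) + eps * big_lambda(x_max)
```

With these values the exponent is of order $10^{-14}$, so the factor is harmless as soon as $E[X_{n,\varepsilon,x_{\max},c}]$ decays exponentially at any appreciable rate.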



(Remark: It will be clear from the proof that this remains true if instead of PPSs we use 
any class of solutions such that any satisfiable formula possesses at least one solution in this 
class, e.g. prime implicants, 'double flips' [30].)

The rest of our plan will be to compute an explicit expression of $E[X_{n,\varepsilon,x_{\max},c}]$ as sums of
combinatorial terms, then an asymptotic exponential upper bound of this expectation. This
will be obtained as a function of values of parameters satisfying a system of equations, which
will be reduced to two equations in two unknowns. Careful study of these equations, coupled
with numerical calculations, will show that for $\Xi_0$ with $c = 4.506$, and appropriate values
of $x_{\max}$ and $\varepsilon$, $2^{(\rho + \varepsilon\Lambda)n}\,E[X_{n,\varepsilon,x_{\max},c}]$ tends to 0 as $n \to \infty$.
2 Basic structural results on random 3-SAT formulae

We have first to prove Lemma 1.1. Here the classical limit theorems of probability do not
apply, and some form of large-deviation inequality has to be used. One method is to first
obtain the expectation of $\omega_{x,p}$, then apply the method of bounded differences (see,
e.g., [23], pp. 16, 221). Or, a proof using Poissonization may be of independent interest,
giving stronger bounds, so we include a detailed one in Appendix A. Interestingly, Lemma
1.1, which is all we need, uses the full power of neither approach.

The quantity $|\{(x,p) : 0 \le p \le x \le x_{\max}\}| = (x_{\max} + 1)(x_{\max} + 2)/2$ is encountered repeatedly
in the sequel; we denote it $D(x_{\max})$ or simply $D$.



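Proof of Proposition 1.2. With the equivalence relation $\mathcal{R}$ as in Section 1.2, and $\mathcal{R}$ also
denoting the relation it induces on $\mathcal{F}(\Xi, \varepsilon/2, x_{\max}, n, c)$, the quotient (canonical) map
$\mathcal{F}(\Xi, \varepsilon/2, x_{\max}, n, c) \to \mathcal{F}(\Xi, \varepsilon/2, x_{\max}, n, c)/\mathcal{R}$ maps $F$ to the (class of the) formula $F^-$
obtained by renaming all variables of $F$ having more positive than negative occurrences.

Recall that $\omega_{x,p}(F)$ denotes the proportion of variables in a formula $F \in \Omega(n,c)$ having $x$
total, $p$ positive occurrences. Then:

$$\omega_{x,p}(F^-) = \begin{cases} \omega_{x,p}(F) + \omega_{x,x-p}(F) & \text{if } x > 2p \\ \omega_{x,p}(F) & \text{if } x = 2p \\ 0 & \text{if } x < 2p \end{cases}$$

A single $F^- \in \mathcal{F}(\Xi, \varepsilon/2, x_{\max}, n, c)/\mathcal{R}$ may come from at most $2^{\nu_u(F^-)}$ formulae (not all
necessarily in $\mathcal{F}(\Xi, \varepsilon/2, x_{\max}, n, c)$). Taking into account that if $x > 2p$, we have
$\mathring{\kappa}_{x,p} = \kappa_{x,p} + \kappa_{x,x-p}$ (because $\kappa_{x,p} = \kappa_{x,x-p}$), we have:

$$|\omega_{x,p}(F^-) - \mathring{\kappa}_{x,p}| \le \begin{cases} |\omega_{x,p}(F) - \kappa_{x,p}| + |\omega_{x,x-p}(F) - \kappa_{x,x-p}| \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon & \text{if } x > 2p \\ |\omega_{x,p}(F) - \kappa_{x,p}| \le \frac{\varepsilon}{2} < \varepsilon & \text{if } x = 2p \\ 0 & \text{if } x < 2p, \end{cases}$$

so that $|\mathcal{F}(\Xi, \varepsilon/2, x_{\max}, n, c)/\mathcal{R}| \le |\mathcal{F}(\Xi_0, \varepsilon, x_{\max}, n, c)|$. Further,

$$\frac{\nu_u(F^-)}{n} \le 1 - \sum_{0 \le 2p \le x_{\max}} \omega_{2p,p}(F^-) \le 1 - \sum_{0 \le 2p \le x_{\max}} \kappa_{2p,p} + \left(\frac{x_{\max}}{2} + 1\right)\frac{\varepsilon}{2} = 2\sum_{0 \le 2p < x} \kappa_{x,p} + \sum_{2p > x_{\max}} \kappa_{2p,p} + \left(\frac{x_{\max}}{2} + 1\right)\frac{\varepsilon}{2}.$$

Therefore, since $\mathring{\kappa}_{2p,p} = \kappa_{2p,p}$,

$$|\mathcal{F}(\Xi, \varepsilon/2, x_{\max}, n, c)| \le 2^{\,(2\sum_{0 \le 2p < x}\kappa_{x,p} + \rho + \varepsilon\Lambda)n} \times |\mathcal{F}(\Xi_0, \varepsilon, x_{\max}, n, c)|.$$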
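Remark. Our bound on $\nu_u(F^-)$ might at first sight seem too loose, since, instead of allowing
all unbalanced variables to be renamed in any combination, we should really pick half of
each group of $\mathring{\kappa}_{x,p}\,n$ and rename only these. Actually, the two bounds do not differ in their
exponential orders of growth as $n \to \infty$.

Note that $F$ is satisfiable iff $F^-$ is. So,

$$|\mathcal{F}(\Xi, \varepsilon/2, x_{\max}, n, c) \cap SAT(n,c)| \le 2^{\,2n\sum_{0 \le 2p < x}\kappa_{x,p}} \times 2^{(\varepsilon\Lambda + \rho)n} \times |\mathcal{F}(\Xi_0, \varepsilon, x_{\max}, n, c) \cap SAT(n,c)|.$$

We are now able to show that if $2^{(\rho + \varepsilon\Lambda)n} \times E[X_{n,\varepsilon,x_{\max},c}]$ tends to 0, then so does the
probability of satisfiability. Indeed:

$$\Pr_{n,c}(SAT) = \frac{|SAT(n,c)|}{|\Omega(n,c)|} \le \frac{|\mathcal{F}(\Xi, \varepsilon/2, x_{\max}, n, c) \cap SAT(n,c)|}{|\Omega(n,c)|} + \sum_{0 \le p \le x \le x_{\max}} \frac{|\{F \in SAT(n,c) : |\omega_{x,p}(F) - \kappa_{x,p}| > \varepsilon/2\}|}{|\Omega(n,c)|}.$$

By Lemma 1.1, each of the $D(x_{\max})$ terms of the last sum tends to 0 as $n \to \infty$, hence:

$$\Pr_{n,c}(SAT) \le 2^{\,2n\sum_{0 \le 2p < x}\kappa_{x,p}} \times 2^{(\varepsilon\Lambda + \rho)n} \times \frac{|\mathcal{F}(\Xi_0, \varepsilon, x_{\max}, n, c) \cap SAT(n,c)|}{|\Omega(n,c)|} + o(1).$$

So,

$$\Pr(SAT) \le \frac{2^{(\varepsilon\Lambda + \rho)n}}{|\Omega(n,c)|} \times \sum_{F \in \mathcal{F}(\Xi_0, \varepsilon, x_{\max}, n, c) \cap SAT(n,c)} 2^{\,2n\sum_{0 \le 2p < x}\kappa_{x,p}} + o(1).$$

Now, since any satisfiable formula has at least one PPS, we can write:

$$\Pr(SAT) \le \frac{2^{(\varepsilon\Lambda + \rho)n}}{|\Omega(n,c)|} \times \sum_{F \in \mathcal{F}(\Xi_0, \varepsilon, x_{\max}, n, c)} 2^{\,2n\sum_{0 \le 2p < x}\kappa_{x,p}} \times PPS(F) + o(1)$$

$$= \frac{2^{(\varepsilon\Lambda + \rho)n}}{|\Omega(n,c)|} \sum_{F \in \mathcal{F}(\Xi_0, \varepsilon, x_{\max}, n, c)} X_{n,\varepsilon,x_{\max},c}(F) + o(1) = 2^{(\varepsilon\Lambda + \rho)n} \times E[X_{n,\varepsilon,x_{\max},c}] + o(1).$$

3 Combinatorial analysis of the expectation.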

3 Combinatorial analysis of the expectation. 

In order to estimate the expected number of PPSs of formulae in $\mathcal{F}(\Xi_0, \varepsilon, x_{\max}, n, c)$, we shall
first compute the number of such formulae having fixed values of the proportions $\omega_{x,p}(F)$ for
$0 \le p \le x \le x_{\max}$. It will be convenient to characterize these formulae as associated with
an element of the set $\Theta_{\varepsilon,x_{\max},n,c} \subseteq \mathbb{Q}^D$ of vectors $\theta = (\theta_{x,p})_{0 \le p \le x \le x_{\max}}$ such that (with the
notation $I_n = \{0, \frac{1}{n}, \frac{2}{n}, \ldots, 1\}$, which applies throughout the sequel):

(i) $\theta_{x,p} \in I_n$, $0 \le p \le x \le x_{\max}$;

(ii) $\sum_{x=0}^{x_{\max}} \sum_{p=0}^{x} \theta_{x,p} \le 1$;

(iii) $|\theta_{x,p} - \mathring{\kappa}_{x,p}| \le \varepsilon$, $0 \le p \le x \le x_{\max}$.

It is clear that a formula $F$ is in $\mathcal{F}(\Xi_0, \varepsilon, x_{\max}, n, c)$ iff the vector $(\omega_{x,p}(F))_{0 \le p \le x \le x_{\max}}$ is in
$\Theta_{\varepsilon,x_{\max},n,c}$. For $\theta \in \Theta_{\varepsilon,x_{\max},n,c}$, we denote by $\mathcal{F}(\theta)$ the subset of $\mathcal{F}(\Xi_0, \varepsilon, x_{\max}, n, c)$ consisting
of those formulae $F$ such that for $0 \le p \le x \le x_{\max}$, $\omega_{x,p}(F) = \theta_{x,p}$. We are able to focus
on the number of elements of $\mathcal{F}(\theta)$ mainly because, as the following lemma shows, the
relatively small (i.e. polynomial) size of $\Theta_{\varepsilon,x_{\max},n,c}$ means that, as far as exponential orders
are concerned, it makes no real difference whether $\theta$ is kept fixed or allowed to vary within
$\Theta_{\varepsilon,x_{\max},n,c}$.

Lemma 3.1 $|\Theta_{\varepsilon,x_{\max},n,c}| \le (2\varepsilon n)^D$.

Proof. If the vector $\theta$ is in $\Theta_{\varepsilon,x_{\max},n,c}$, then for $0 \le p \le x \le x_{\max}$, $\theta_{x,p}\,n$ is an integer
comprised between $(\mathring{\kappa}_{x,p} - \varepsilon)n$ and $(\mathring{\kappa}_{x,p} + \varepsilon)n$, so there are at most $2\varepsilon n$ possible values for
$\theta_{x,p}$. $\Box$

3.2 Counting formulae with a given PPS and fixed proportions of 
variables having given numbers of occurrences 

For some given $\varepsilon$ and $x_{\max}$, we now consider a fixed vector $\theta \in \Theta_{\varepsilon,x_{\max},n,c}$ and a truth
value assignment $A \in \{0,1\}^n$, identified with the subset $A^{-1}(1)$ of the set of variables $V_n$.
Let $\mathcal{F}(\theta, A)$ be the set of formulae $F \in \Omega(n,c)$ such that $A$ is a PPS of $F$ and that for
$0 \le p \le x \le x_{\max}$, $\omega_{x,p}(F) = \theta_{x,p}$. Thus:

Proposition 3.2 $$E(X_{n,\varepsilon,x_{\max},c}) = \frac{2^{\,2n\sum_{0 \le 2p < x}\kappa_{x,p}}}{(2n)^{3cn}} \sum_{\theta \in \Theta_{\varepsilon,x_{\max},n,c}} \; \sum_{A \in \{0,1\}^n} |\mathcal{F}(\theta, A)|.$$

Our next goal is to estimate the size of $\mathcal{F}(\theta, A)$ for a fixed $\theta \in \Theta_{\varepsilon,x_{\max},n,c}$ and $A \in \{0,1\}^n$.
Abundant use will be made of the quantities $\tau = \tau(\theta, x_{\max}) = 1 - \sum_{0 \le p \le x \le x_{\max}} \theta_{x,p}$ and
$\sigma = \sigma(\theta, x_{\max}) = \lambda - \sum_{0 \le p \le x \le x_{\max}} x\,\theta_{x,p}$. $\tau$ is, of course, nonnegative by definition; for any
$F \in \mathcal{F}(\theta, A)$, $\tau$ represents the proportion of variables having more than $x_{\max}$ occurrences in
$F$. Also, $\sigma$ is nonnegative, since for $F \in \mathcal{F}(\theta, A)$, $\sigma n$ is the number of literal occurrences in
$F$ (among the total $\lambda n$) whose underlying variables have more than $x_{\max}$ occurrences.
Given a formula $F$ and a truth assignment $A$, we say that a clause of $F$ is of type $(A, j)$
($0 \le j \le 3$) iff it has $j$ nonzero literals under $A$. To say that $A$ satisfies $F$ means that $F$
has no clauses of type $(A, 0)$.

Now suppose $A$ is a PPS of $F \in \mathcal{F}(\theta, A)$, let $v$ be one of the variables having $x$ total, $p$
positive occurrences in $F$, and $q$ its number of occurrences in type-$(A, 1)$ clauses as the unique
satisfying literal. If $v$ has value 1 under $A$, then $q = p - j$ for some $j$ with $0 \le j \le p - 1$;
excluding $j = p$ expresses exactly that $A$ is a PPS. If $v$ has value 0 under $A$, then $q = j - p$
for some $j$ with $p \le j \le x$. Since the two cases cover exactly once each possible $j$ between 0
and $x$, they can be conveniently coalesced by saying that for any variable there is a unique
$j$ with $0 \le j \le x$, such that $|p - j|$ of its occurrences are in type-$(A, 1)$ clauses, the value
of the variable under $A$ being then automatically determined by the sign of $p - j$. It is
$\frac{1}{2}\left(1 + (p - j)/|p - j|\right)$ if $j \ne p$ and 0 by convention if $j = p$. We call such a variable a
variable of type $(A, x, p, j)$, and thus to say that $A$ is a PPS of $F \in \mathcal{F}(\theta, A)$ means exactly
that every variable is of type $(A, x, p, j)$ for some $x$, $p$ and $j$ with $0 \le p \le x$ and $0 \le j \le x$. In
our enumerations, however, we will only impose this condition for $x \le x_{\max}$. The variables
with more than $x_{\max}$ occurrences, or heavy variables, will be considered unconstrained,
and we will broadly overestimate the number of corresponding choices. If our expectation
calculated by excess tends to 0, so does the true expectation.
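
The coalescing of the two cases into a single index $j$ can be made concrete with a small sketch. The function names are assumptions of this example; `q` is the number of occurrences of the variable that are the unique satisfying literal of a type-$(A, 1)$ clause.

```python
def variable_type(x, p, value, q):
    """Return the index j such that the variable is of type (A, x, p, j),
    or None if (value, q) is impossible for a PPS."""
    if value == 1:
        # some positive occurrence must block the flip: 1 <= q <= p, j = p - q
        return p - q if 1 <= q <= p else None
    # value 0: only the x - p negative occurrences are nonzero, j = p + q
    return p + q if 0 <= q <= x - p else None

def value_from_type(p, j):
    """The value under A is determined by the sign of p - j (0 when j == p)."""
    return 1 if j < p else 0
```

For $x = 4$ and $p = 2$, the admissible pairs (value, $q$) map onto $j = 0, 1, 2, 3, 4$, each exactly once, as stated above.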

Recall that we use the notation $I_n = \{0, 1/n, 2/n, \ldots, 1 - 1/n, 1\}$. Given the vector $\theta \in \Theta_{\varepsilon,x_{\max},n,c}$,
the assignment $A$, and rationals $\gamma_1, \gamma_2, \gamma_3 \in I_{cn}$ and $\mu_{x,p,j} \in I_{\theta_{x,p} n}$ ($0 \le p, j \le x \le x_{\max}$), we
proceed to count the formulae in $\mathcal{F}(\theta, A)$

• consisting of $\gamma_i cn$ clauses of type $(A, i)$, $i = 1, 2, 3$, and

• such that the number of variables of type $(A, x, p, j)$ is $\mu_{x,p,j}\theta_{x,p}\,n$ for $0 \le p, j \le x \le x_{\max}$.

We assume, of course, $\gamma_1 + \gamma_2 + \gamma_3 = 1$ and $\sum_{j=0}^{x} \mu_{x,p,j} = 1$ for $0 \le p \le x \le x_{\max}$. Let
$Z(\theta, \gamma, \mu, n, c)$ be the number of such formulae.

The empty formula template $\Lambda_{c,n}$ contains $\lambda n$ cells, with $\lambda = 3c$. We first choose those which
will correspond to each type of clause, and within each group, those to be filled with literals
of value 1. This can be done in $A_n(\gamma, c)$ ways, where

$$A_n(\gamma, c) = \frac{(cn)!}{(\gamma_1 cn)!\,(\gamma_2 cn)!\,(\gamma_3 cn)!}\; 3^{(\gamma_1 + \gamma_2)cn}.$$
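
The count $A_n(\gamma, c)$ can be sanity-checked on tiny instances by enumerating, for every clause template, which of its three cells hold a literal of value 1. The sketch below writes $m_i = \gamma_i cn$ for the number of clauses of type $(A, i)$; the function names are assumptions of this example.

```python
from itertools import product
from math import factorial

def count_A(m1, m2, m3):
    """Multinomial choice of clause types times 3 ways to place the
    true cell(s) in each clause of type 1 or 2 (1 way for type 3)."""
    m = m1 + m2 + m3
    multinomial = factorial(m) // (factorial(m1) * factorial(m2) * factorial(m3))
    return multinomial * 3 ** (m1 + m2)

def brute_force_A(m1, m2, m3):
    """Enumerate all 0/1 cell patterns with no all-zero clause and count
    those having exactly m_i clauses with i nonzero cells."""
    patterns = [pat for pat in product((0, 1), repeat=3) if any(pat)]
    target = (m1, m2, m3)
    hits = 0
    for choice in product(patterns, repeat=m1 + m2 + m3):
        counts = [0, 0, 0]
        for pat in choice:
            counts[sum(pat) - 1] += 1
        if tuple(counts) == target:
            hits += 1
    return hits
```

The brute-force enumeration grows as $7^m$ and is only meant for very small numbers of clauses.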

Second, among the $n$ variables we choose, for $0 \le p \le x \le x_{\max}$, the $\theta_{x,p}\,n$ which will
have $x$ total, $p$ positive occurrences, and among these the $\mu_{x,p,j}\theta_{x,p}\,n$ which will be of type
$(A, x, p, j)$. Recall that given $\mu_{x,p,j}$ the values under $A$ of the $\mu_{x,p,j}\theta_{x,p}\,n$ corresponding
variables are automatically determined. We complete the specification of $A$ by choosing the
values of the remaining $\tau n$ heavy variables (recall $\tau = 1 - \sum_{0 \le p \le x \le x_{\max}} \theta_{x,p}$). The number
of possibilities is:

$$B_n(\theta, \mu) = 2^{\tau n}\, \frac{n!}{(\tau n)!\,\prod_{0 \le p \le x \le x_{\max}} (\theta_{x,p}\,n)!}\; \prod_{0 \le p \le x \le x_{\max}} \frac{(\theta_{x,p}\,n)!}{\prod_{j=0}^{x} (\mu_{x,p,j}\theta_{x,p}\,n)!}.$$
Finally, we effectively fill the cells with the variables of different types. Let $M_n(\theta, \gamma, \mu)$
be the number of ways to do this and obtain a formula in $\mathcal{F}(\theta, A)$ meeting our requirements.
We start with the heavy variables, which must have $\sigma n$ occurrences (recall
$\sigma = \lambda - \sum_{0 \le p \le x \le x_{\max}} x\,\theta_{x,p}$). We assign their occurrences to cells, which automatically
determines the sign of each occurrence, having already completely specified $A$ on the one hand,
and the contents, 0 or 1, of each cell, on the other. We bound the ways to assign all the
occurrences of heavy variables to cells by the quantity

$$\eta(\theta, n, c) = \binom{\lambda n}{\sigma n} (\tau n)^{\sigma n}.$$

The $\gamma_1 cn$ clauses of type $(A, 1)$ contain $\gamma_1 \lambda n$ cells, $\gamma_1 cn$ of which are already reserved for
nonzero literals. Among these, some already contain occurrences of heavy variables. Let
their number be $\tilde{\sigma}_1 n$; this is not an independent parameter since

$$\tilde{\sigma}_1 = \gamma_1 c - \sum_{\substack{0 \le p \le x \le x_{\max} \\ 0 \le j \le x}} \theta_{x,p}\,|p - j|\,\mu_{x,p,j}. \qquad (2)$$

There remain $(\gamma_1 c - \tilde{\sigma}_1)n$ cells to be filled in this group. These are filled with the $p - j$
unnegated occurrences of variables of type $(A, x, p, j)$ with $0 \le j \le p - 1$ and the $j - p$
negated occurrences of variables of type $(A, x, p, j)$ with $p \le j \le x$. Thus the number of
ways to fill the $(\gamma_1 c - \tilde{\sigma}_1)n$ cells is:

$$M_1 = \frac{[(\gamma_1 c - \tilde{\sigma}_1)n]!}{\prod_{0 \le p \le x \le x_{\max}} \prod_{j=0}^{x} (|p - j|!)^{\mu_{x,p,j}\theta_{x,p} n}}.$$



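Next, we fill the cells already reserved for nonzero literals, which do not pertain to clauses
of type $(A, 1)$. It will be convenient to introduce the normalized nonzero spread of $F$ under
$A$, namely

$$\psi = \tfrac{1}{3}(\gamma_1 + 2\gamma_2 + 3\gamma_3). \qquad (3)$$

Among the $\lambda\psi n$ cells in total which are to receive nonzero literals, let $\sigma_1 n$ ones contain
occurrences of heavy variables. $\sigma_1$, like $\tilde{\sigma}_1$, is a known quantity:

$$\sigma_1 = \lambda\psi - \sum_{0 \le p \le x \le x_{\max}} \theta_{x,p} \left[\, p \sum_{0 \le j \le p-1} \mu_{x,p,j} + (x - p) \sum_{p \le j \le x} \mu_{x,p,j} \right]. \qquad (4)$$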



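For the $(\lambda\psi - (\sigma_1 - \tilde{\sigma}_1) - \gamma_1 c)\,n$ remaining cells in this group, we have available, for each
variable of type $(A, x, p, j)$ with $0 \le j \le p - 1$, the $p$ unnegated occurrences less $p - j$ already
placed; and if the type is $(A, x, p, j)$ with $p \le j \le x$, the $x - p$ negated occurrences less $j - p$
already placed. Thus, the number of ways to do the assignment is

$$M_2 = \frac{[(\lambda\psi - \gamma_1 c - \sigma_1 + \tilde{\sigma}_1)n]!}{\prod_{0 \le p \le x \le x_{\max}} \left[ \prod_{j=0}^{p-1} (j!)^{\mu_{x,p,j}\theta_{x,p} n} \prod_{j=p}^{x} ((x - j)!)^{\mu_{x,p,j}\theta_{x,p} n} \right]}.$$

Lastly, we deal with the $\lambda(1 - \psi)n$ cells reserved for null literals, of which $(\sigma - \sigma_1)n$ are
already filled. For the remaining ones, we have $x - p$ occurrences of variables of type $(A, x, p, j)$
with $0 \le j \le p - 1$, and $p$ occurrences if $p \le j \le x$. So, we can fill them in $M_3$ ways, where

$$M_3 = \frac{\{[\lambda(1 - \psi) - \sigma + \sigma_1]n\}!}{\prod_{0 \le p \le x \le x_{\max}} \left[ \prod_{j=0}^{p-1} ((x - p)!)^{\mu_{x,p,j}\theta_{x,p} n} \prod_{j=p}^{x} (p!)^{\mu_{x,p,j}\theta_{x,p} n} \right]}.$$

To sum up, $M_n(\theta, \gamma, \mu) \le M_1 M_2 M_3\, \eta(\theta, n, c)$, so that

$$Z(\theta, \gamma, \mu, n, c) \le A_n(\gamma, c)\, B_n(\theta, \mu)\, M_1 M_2 M_3\, \eta(\theta, n, c). \qquad (5)$$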

3.3 The expectation. 

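It follows from the preceding discussion, and the definition of $\mathcal{F}(\theta, A)$, that, setting
$J_n = \bigcup_{1 \le k \le n} I_k$, we have

$$\sum_{A \in \{0,1\}^n} |\mathcal{F}(\theta, A)| \le \eta(\theta, n, c) \sum_{\gamma_i \in I_{cn}} A_n(\gamma, c) \sum_{\mu_{x,p,j} \in J_n} B_n(\theta, \mu)\, M_1 M_2 M_3, \qquad (6)$$

where the summation is under the constraints

$$\gamma_1 + \gamma_2 + \gamma_3 = 1 \qquad (7)$$

and

$$\sum_{j=0}^{x} \mu_{x,p,j} = 1, \quad 0 \le p \le x \le x_{\max}, \qquad (8)$$

and where $\tilde{\sigma}_1$ and $\sigma_1$ are expressed in $M_1$, $M_2$ and $M_3$ as functions of $\gamma$ and $\mu$, see (2) and
(4).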

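We now introduce a modified form of (4) which will be convenient later. We set

$$\alpha_{x,p} = \sum_{0 \le j \le p-1} \mu_{x,p,j}, \qquad (9)$$

the proportion of variables with $x$ total, $p$ positive occurrences having the value 1 under $A$.
Taking account of (8), (4) can be written using only the $\alpha_{x,p}$'s:

$$\lambda\psi - \sigma_1 = K(\theta) - \sum_{0 \le p \le x \le x_{\max}} H_{x,p}(\theta)\,\alpha_{x,p}, \quad \text{where} \qquad (10)$$

$K(\theta) = \sum_{0 \le p \le x \le x_{\max}} (x - p)\,\theta_{x,p}$ and, for $0 \le p \le x \le x_{\max}$: $H_{x,p}(\theta) = (x - 2p)\,\theta_{x,p}$.
From Lemma 3.1, Proposition 3.2, and (6), we get, for some fixed $\theta \in \Theta_{\varepsilon,x_{\max},n,c}$:

$$E(X_{n,\varepsilon,x_{\max},c}) \le \frac{2^{\,2n\sum_{0 \le 2p < x}\kappa_{x,p}}\,(2\varepsilon n)^D}{(2n)^{\lambda n}}\; \eta(\theta, n, c) \sum_{\gamma_i \in I_{cn}} A_n(\gamma, c) \sum_{\mu_{x,p,j} \in J_n} B_n(\theta, \mu)\, M_1 M_2 M_3, \qquad (11)$$

subject again to (7) and (8).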
4 Asymptotics.



4.1 Bound for the exponential order. 



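Still for a fixed $\theta$, we now bound the general term of (11), using a standard inequality for
multinomial coefficients:

$$\binom{T}{r_1, r_2, \ldots, r_s} \le \frac{T^T}{r_1^{r_1} r_2^{r_2} \cdots r_s^{r_s}},$$

which gives first, taking account of (7):

$$A_n(\gamma, c)^{1/n} \le \left[ \frac{3}{\gamma_1^{\gamma_1} \gamma_2^{\gamma_2} (3\gamma_3)^{\gamma_3}} \right]^c;$$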

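further,

$$B_n(\theta, \mu)^{1/n} \le \frac{2^{\tau}}{\tau^{\tau} \prod_{0 \le p \le x \le x_{\max}} \left[ \theta_{x,p}^{\theta_{x,p}} \prod_{j=0}^{x} \mu_{x,p,j}^{\mu_{x,p,j}\theta_{x,p}} \right]},$$

$$M_1^{1/n} \le \frac{[(\gamma_1 c - \tilde{\sigma}_1)n]^{\gamma_1 c - \tilde{\sigma}_1}}{\prod_{0 \le p \le x \le x_{\max}} \left[ \prod_{j=0}^{p-1} ((p - j)!)^{\mu_{x,p,j}} \prod_{j=p}^{x} ((j - p)!)^{\mu_{x,p,j}} \right]^{\theta_{x,p}}};$$

next, bearing in mind that, by (10), $\lambda\psi - \sigma_1$ does not depend on $n$:

$$M_2^{1/n} \le \frac{[(\lambda\psi - \gamma_1 c - \sigma_1 + \tilde{\sigma}_1)n]^{\lambda\psi - \gamma_1 c - \sigma_1 + \tilde{\sigma}_1}}{\prod_{0 \le p \le x \le x_{\max}} \left[ \prod_{j=0}^{p-1} (j!)^{\mu_{x,p,j}} \prod_{j=p}^{x} ((x - j)!)^{\mu_{x,p,j}} \right]^{\theta_{x,p}}},$$

and, using (9),

$$M_3^{1/n} \le \frac{\{[\lambda(1 - \psi) - \sigma + \sigma_1]n\}^{\lambda(1 - \psi) - \sigma + \sigma_1}}{\prod_{0 \le p \le x \le x_{\max}} \left[ ((x - p)!)^{\alpha_{x,p}}\, (p!)^{1 - \alpha_{x,p}} \right]^{\theta_{x,p}}}.$$

So, writing $(p - j)!\,j! = p!/\binom{p}{j}$ and $(j - p)!\,(x - j)! = (x - p)!/\binom{x-p}{j-p}$:

$$M_1^{1/n} M_2^{1/n} \le [(\gamma_1 c - \tilde{\sigma}_1)n]^{\gamma_1 c - \tilde{\sigma}_1}\, [(\lambda\psi - \gamma_1 c - \sigma_1 + \tilde{\sigma}_1)n]^{\lambda\psi - \gamma_1 c - \sigma_1 + \tilde{\sigma}_1} \prod_{0 \le p \le x \le x_{\max}} \left[ \frac{\prod_{j=0}^{p-1} \binom{p}{j}^{\mu_{x,p,j}} \prod_{j=p}^{x} \binom{x-p}{j-p}^{\mu_{x,p,j}}}{(p!)^{\alpha_{x,p}}\, ((x - p)!)^{1 - \alpha_{x,p}}} \right]^{\theta_{x,p}}.$$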

Bounding the sum in (0) by its maximum term times the number of terms |J cn | \ J n \ with 
| J n \ < n (n + 3) /2, we get, after some simplification and whenever c < 5: 
<



3 C (An) 



A-(7 



a e 



n 



^ (2/r) T 
[p! (rr-p)W„] fl - 



(6n 3 ) 



3\ rt 



max 



( 7l -CTi/c) 



X 



7,-e/cn, ^, PJ 6J„ [7^7^ (37 3 ) 73 ]' 
[3 (1 - V) + ^/c - rT/c] A(1 -^ +ffl - ff (3^ - 71 - *i/c + ^/c^" 71 ^ 1 ^ 1 



(12) 



n 



0<p<a;<a: ■m a.r 



X / \ U 

11 ■ \ ^ x 



n 



where the maximum is subject to all the above constraints (7) and (8); and where

$$h_{x,p,j} = \begin{cases} \binom{p}{j} & \text{if } 0 \le j \le p - 1, \\ \binom{x-p}{j-p} & \text{if } p \le j \le x. \end{cases}$$

Since $\binom{p}{j} = \binom{p}{p-j}$, it may be observed that $h_{x,p,j}$ is the number of ways to select, among the
literals with value 1 under $A$ associated with a given variable of type $(A, x, p, j)$ (assumed
distinguishable), those (if any) destined to prevent the flipping of that variable.

Finally, still for a fixed value of $\theta \in \Theta_{\varepsilon,x_{\max},n,c}$, we can extend the max in the above estimate
to arbitrary real values of the $\gamma_i$'s and of the $\mu_{x,p,j}$s in $[0, 1]$, subject to the stated constraints.



4.2 A priori bounds on the main parameters. 

We are about to replace our estimate (12) by one that is uniformly valid for all $\theta \in \Theta_{\varepsilon,x_{\max},n,c}$,
and to that end will require that $c$ be bounded from above and below, and will have to check
some inequalities involving $c$, $\varepsilon$ and $x_{\max}$. To give our estimate in reasonable generality,
we assume $0 < c_{\min} \le c \le c_{\max}$ with, for the moment, only a mild and fairly arbitrary
constraint on $c_{\min}$ and $c_{\max}$, say $3 \le c_{\min} \le c_{\max} \le 5$; correspondingly, $\lambda$ is restricted
to $[\lambda_{\min}, \lambda_{\max}]$ with $9 \le \lambda_{\min} \le \lambda_{\max} \le 15$. Later, we will be more specific and impose
$c_{\min} = c_{\max} = 4.506$.

For such an interval $[c_{\min}, c_{\max}]$, it is easy, by elementary expectation calculations, to
determine intervals $[\gamma_{1\min}, \gamma_{1\max}]$, $[\gamma_{2\min}, \gamma_{2\max}]$, $[\gamma_{3\min}, \gamma_{3\max}]$, $[\psi_{\min}, \psi_{\max}]$, such that for
$c \in [c_{\min}, c_{\max}]$, the probability that a formula in $\Omega(n,c)$ has a solution with at least one of
$\gamma_1, \gamma_2, \gamma_3, \psi$ falling outside the corresponding range is always exponentially small. For
example, for $[c_{\min}, c_{\max}] \subseteq [3, 5]$ we can take these intervals to be $[0.21, 0.65]$, $[0.21, 0.65]$,
$[0.017, 0.32]$, and $[0.47, 0.68]$, respectively.

This means that in investigating, by more sophisticated means, the probability that a formula
in $Q(n, c)$ is satisfiable, we need only consider solutions, or indeed PPSs, with $\gamma_1$, $\gamma_2$, $\gamma_3$ and
$\psi$ in their respective intervals. Thus, we can define the r.v. $X_{n,\varepsilon,x_{\max},c}$ with these more
restricted PPSs, and $T(\theta, A)$ similarly. All that we have said up to now goes over, notably
Propositions 1.2 and 3.2; and (O) holds, with the maximum subject to these additional
restrictions, viz.

$$\gamma_j \in [\gamma_{j\min}, \gamma_{j\max}],\ j = 1, 2, 3; \qquad \psi \in [\psi_{\min}, \psi_{\max}]. \tag{13}$$

Henceforth we assume these additional constraints throughout; we also fix $\varepsilon = 10^{-15}$ and
$x_{\max} = 56$.



4.3 The 6-free estimate. 

Deriving from (12), at controllably small cost, an estimate where the fixed but unknown
$\theta_{x,p}$'s are replaced by the known $\kappa_{x,p}$'s is a matter of easy but tedious calculations which
we will only sketch. Anyway, one could get by with coarser bounds than we give by simply
choosing a smaller $\varepsilon$ and larger $x_{\max}$. Note that we have relied on $x_{\max}$ being even to simplify
some of the calculations slightly.

One somewhat delicate point is how to deal with the numerous quantities of the form $x^x$
where $x$ is unknown but 'near' some known $y$. We use the following very elementary lemma,
not sharp but sufficient, so not worth improving.

Lemma 4.1 Let $L$ be the function $\eta \mapsto (2\eta)^{-2\eta}$ on $\mathbb{R}_+$. Then whenever $x, y, \eta$ are positive
reals with $\eta \le 0.05$, $|x - y| \le \eta$, and $x \le 30$, we have $L(\eta)^{-1} \le y^y / x^x \le L(\eta)$.

Proof. (outline) For $x \le 30$, we study the function $f_x(h) = (x+h)^{x+h}$ on the interval
$[\max(-x, -\eta), \eta]$, showing that whenever $|h| \le \eta$, $f_x(h)/x^x$ falls between $L(\eta)^{-1}$
and $L(\eta)$. This is done by elementary monotony considerations, distinguishing the two cases
$|h| \ge x$ and $|h| < x$, the second being split into two subcases where the double inequality
$1/e - \eta \le x \le 1/e + \eta$ either holds or not. ■
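As a sanity check on the statement of Lemma 4.1 (read here as $L(\eta) = (2\eta)^{-2\eta}$ with the two-sided bound $L(\eta)^{-1} \le y^y/x^x \le L(\eta)$), the following sketch samples the inequality on a grid. The grid values and the helper names `L`, `log_ratio` and `lemma_holds_on_grid` are illustrative assumptions, not part of the original argument, and a grid test is of course no substitute for the proof.

```python
import math

def L(eta):
    """L(eta) = (2*eta)**(-2*eta), the bounding function of Lemma 4.1."""
    return (2.0 * eta) ** (-2.0 * eta)

def log_ratio(x, y):
    """log(y**y / x**x), computed in log space to stay stable for x up to 30."""
    return y * math.log(y) - x * math.log(x)

def lemma_holds_on_grid(etas, xs, steps=40):
    """Check |log(y^y / x^x)| <= log L(eta) for sampled y with |x - y| <= eta."""
    for eta in etas:
        bound = math.log(L(eta))
        for x in xs:
            for k in range(-steps, steps + 1):
                y = x + eta * k / steps
                if y <= 0.0:  # the lemma only concerns positive reals
                    continue
                if abs(log_ratio(x, y)) > bound + 1e-12:
                    return False
    return True

# Hypothetical sample points: eta <= 0.05 and x <= 30, as the lemma requires.
ETAS = [0.05, 0.01, 1e-3, 1e-6]
XS = [1e-3, 0.01, 0.05, 0.2, 1 / math.e, 1.0, 5.0, 30.0]
print(lemma_holds_on_grid(ETAS, XS))
```

Working with logarithms turns the two-sided bound into a single absolute-value test, which is also how the lemma is used when many such factors are multiplied.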

4.3.1 Eliminating the $\sigma$'s and $\tau$, and withdrawing $\theta$ from $\psi$ and $\gamma$.

From t < 1 - Yjx, p Kx,p + Y,x, P - «x,p|, we get, in terms of D = D (x max ), 

t (Q ,x max , A) R\ (5, x max } 
uniformly for all G Q E ,x max ,n,c and all relevant A, where 

R\ (5, x max ) —. - . -|- sD {x max } . 

K^max 1 J- J- 
Similarly, from a < ^A - J2 X P XK x,pj + J2 xp x l^,p - K x, P \ we obtain, in terms of P2 (£)
£ (£ + 1) (£ + 2) /3 that a (0 

i x max 1 

A) < i? 2 (e j %max) ; again uniformly for all G ©£,x ma;r ,r. 

and relevant A, where 



i?2 x max 1 



.2 . 



As for <7i and ai, we simply use <7i < <Xi < a. 



Turning to ip and 7, from (Q) and (^) there are natural candidates for the #-free versions of 
7 X and ip, namely 



0<p<x<x mal 



(p-j)vx, P ,j+ Yl u-p)^ 

.o<j<p— 1 p<j<x 



tE 



o<i<p-i 

or, equivalently so long as @ holds, cf. (|T0|): 



^,pj + (x-p) ^ 



z,pj 



p<j<x 



(14) 



(15) 



1 

A 



E 

0<2p<a;<x rl 



(16) 



with

$$K = \sum_{0 \le 2p \le x \le x_{\max}} (x-p)\,\kappa_{x,p} \quad\text{and}\quad H_{x,p} = (x-2p)\,\kappa_{x,p},$$

which is the definition of $\phi$ we adopt, since it will be helpful subsequently.



Observing that from (7) and (3), we have $\gamma_2 = 3(1-\psi) - 2\gamma_1$ and $\gamma_3 = \gamma_1 - 2 + 3\psi$, we
define the $\theta$-free versions of $\gamma_2$ and $\gamma_3$ as respectively

$$\beta_2 = 3(1-\phi) - 2\beta_1 \quad\text{and}\quad \beta_3 = \beta_1 - 2 + 3\phi. \tag{17}$$



Using the above bounds on a, it is easy to estimate \<j> — ip\ , \/3j — jj \ , and the worst- 
case error incurred in replacing (ft by ip and the /3_-'s by the 7-'s in (|I2]). Setting P3 (£) = 
£(£ + 2) (2f + 3) /Sand 



-£^3 v-i x max) 



A. 



.1 r 



"I - [-f*2 (^marr) "I - -f*3 (^nwa)] 



I A • 



we find

$$|\phi - \psi| \le R_3(\varepsilon, x_{\max}),\quad |\beta_1 - \gamma_1| \le 3R_3(\varepsilon, x_{\max}),\quad |\beta_2 - \gamma_2| \le 9R_3(\varepsilon, x_{\max}),\quad |\beta_3 - \gamma_3| \le 6R_3(\varepsilon, x_{\max}), \tag{18}$$

and therefore the constraints (13) imply the following ones on $\mu$:

$$\beta_j \in [\beta_{j\min}, \beta_{j\max}],\ j = 1, 2, 3; \qquad \phi \in [\phi_{\min}, \phi_{\max}], \tag{19}$$

where $\beta_{1\min} = \gamma_{1\min} - 3R_3(\varepsilon, x_{\max})$, $\beta_{1\max} = \gamma_{1\max} + 3R_3(\varepsilon, x_{\max})$, and so on.

Since $R_3(\varepsilon, x_{\max}) < 1.035 \times 10^{-9}$, from (18) and (19) we see that all of $\beta_2$, $\beta_3$, $\phi$, $3\phi - \beta_1$
and $3(1-\phi)$ are positive, $\le 30$, and at a distance of less than 0.05 from the corresponding
$\theta$-dependent quantities. This allows us, using also $\sigma \le R_1(\varepsilon, x_{\max}) < 1.54 \times 10^{-8}$, repeatedly
to apply Lemma 4.1 and get:



< G 1 (e,x max )3 c (\n) x (j-^J x 



(6n 3 ) n max 



(30 _ ^J^i [3 (1 _ 0]3<W) (3/ g s) -A 



n 

0<P<a;<a;n 



p! (x-p)!^ f[ 



where 



Gi (e, z r , 



2^ (E,W) I {Ri (e, x max )) L (R 2 (e, x max )) 



18 e 



i=o 



R2(e,Xmax) 



X 



-R2 (g, 3V, 

6 



I - P3 ('I'max ) ) ( < ^3 (•fmai, 



A, 



A T 



X 



[(L (3/23 (g, x max ))f L (9R 3 (e, aw)) L (6i? 3 (e, ^)) 3^^""-)]"™*" 
and the max is subject to (|§P and flOp. 



4.3.2 Removing $\theta$ from the powers-and-factorials product.

Since the only real difficulty is, possibly, getting started on the right path, we only indicate
how we break down the error incurred from replacing the $\theta_{x,p}$'s by the $\kappa_{x,p}$'s, into three
factors $A$, $B$, $C$, to be estimated separately. We have



< 



ABC 



n 

0<p<x<x„ 



p! (x-p)W x , P U (v^ 



n 

0<2p<x<x maI 



p\ (x-p)\k X!P n 



3=0 



(note the change of domain for the index $p$), with



.4 



B 



C 



n 

l<p<X<X m ax, 2p>X 

n 

0<2p<X<X m ax 



p\ (x -p) ] -0 XjP Y[ 



3=0 



x,p,3 



x;p 



n 

0<2p<x< 



p! (X -p)!/«x,p JJ ( 

j=0 V ft a;',P,i 
(20)



We find A<G A (e, x max ) , B <G B (e, x max ) , C < G c (e, x max , A) , with 
G A (e,x max ) = iT2^^ +2x — l >L ( e ) +3) 

G B (e, x max ) = \{ (1 + e) L^J < (1 + e)^ 

2 = 

and, using the fact that 

for all A £ [AfTjjjj, A ma ^] , x max 
Gc (e, Xmax, A) = 

( rp _i_ 1 \ 2 O ^T^ - (^max + l)(^max 7) J O^mai +4 

( " I 

Observing that since 

Xmax A maa; , 

e~ A (A) 2 "™"* increases with A within our range of interest, and setting G2 (£,x max ) = 

G A (e, x max ) G B (e, x maa; ) G c (e, s m(B , A ma:c ) , we conclude that, for any c in our chosen range 

and the max being again subject to @ and (|19|): 



2A-log2 
log A — log 2 ' 



4 



(21) 



£ 1^.^)1 

•Ae{o,i} n 



< Gi( 



3 C (An) A ( ) (6n 3 ) ; x (22) 



max 



(30 _ [3 (1 _ ^(W) ^-A ( 3/ g 8 )-/?. J* 



n 

0<2p<X<X maa : 



p!(x-p)!«^n 



Note that the $\mu_{x,p,j}$'s with $2p > x$ are now irrelevant, having vanished from the bound (they
do not actually figure in $\beta_2$, $\beta_3$, or $\phi$), and thus the equality constraints under which we
now perform the maximisation are just (8) for $0 \le 2p \le x \le x_{\max}$.



5 Maximization. 



By (19), for $c$ within our range, the max on the r.h.s. of (22) may be restricted to vectors
$\mu \in U$, where

$$U = \beta_1^{-1}(]\beta_{1\min}, \beta_{1\max}[) \cap \beta_2^{-1}(]\beta_{2\min}, \beta_{2\max}[) \cap \beta_3^{-1}(]\beta_{3\min}, \beta_{3\max}[) \cap \phi^{-1}(]\phi_{\min}, \phi_{\max}[)$$

is an open subset of $\mathbb{R}^N$, where $N = N(x_{\max}) = \frac{1}{24}(x_{\max}+2)(4x_{\max}^2 + 13x_{\max} + 12)$
(recall that we have dropped the irrelevant variables $\mu_{x,p,j}$ with $p > x/2$). For the moment,
we do not further specify these reals; we do so later when restricting $c$ to a single value. (We
do already assume $3\phi_{\min} > \beta_{1\max}$ though.) For now, (22) leads to the following problem of
constrained maximization:

$$\max_{\mu \in [0,1]^N \cap U} \Big\{ \sum_{0 \le 2p \le x \le x_{\max}} \kappa_{x,p} \sum_{j=0}^{x} \mu_{x,p,j} \log\frac{h_{x,p,j}}{\mu_{x,p,j}} + c\,\big\{(3\phi - \beta_1)\log(3\phi - \beta_1) + [3(1-\phi)]\log[3(1-\phi)] - \beta_2\log\beta_2 - \beta_3\log(3\beta_3)\big\} \Big\} \tag{23}$$

subject to the constraints (8), which we rewrite as

$$C_{x,p} = 0, \quad\text{where}\quad C_{x,p} = -1 + \sum_{j=0}^{x} \mu_{x,p,j}. \tag{24}$$

This is not yet amenable to traditional differential techniques, since the set $[0,1]^N \cap U$ is not
open. However, it is not difficult to bar out the vectors on the boundary as candidates for
global, indeed even local, maximizers, as we now proceed to do.
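The dimension $N(x_{\max})$ quoted above can be cross-checked by direct enumeration. The sketch below assumes that the retained variables are exactly the triples $(x,p,j)$ with $0 \le 2p \le x \le x_{\max}$ and $0 \le j \le x$, and that the closed form reads $(x_{\max}+2)(4x_{\max}^2+13x_{\max}+12)/24$ for even $x_{\max}$; the function names are illustrative only.

```python
def count_variables(x_max):
    """Number of triples (x, p, j) with 0 <= 2p <= x <= x_max and 0 <= j <= x."""
    return sum(x + 1 for x in range(x_max + 1) for _p in range(x // 2 + 1))

def closed_form(x_max):
    """(x_max + 2)(4 x_max^2 + 13 x_max + 12) / 24, intended for even x_max."""
    return (x_max + 2) * (4 * x_max ** 2 + 13 * x_max + 12) // 24

# Compare the brute-force count with the closed form for a few even values.
for x_max in (2, 4, 56):
    print(x_max, count_variables(x_max), closed_form(x_max))
```

For the value $x_{\max} = 56$ fixed earlier, both counts give the number of unknowns $\mu_{x,p,j}$ over which the maximization is carried out.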



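Let us compute the gradient of the function of $\mu$ maximized in (23), say $f_1(\mu)$. For the
quantity inside the braces, using (17) and (7), and setting

$$U = \frac{9(1-\phi)\beta_3}{(3\phi - \beta_1)\beta_2} \quad\text{and}\quad V = 1 + \frac{\beta_2^2}{3\beta_3(3\phi - \beta_1)} = \frac{(\beta_1 + 6\phi - 3)^2}{3\beta_3(3\phi - \beta_1)},$$

we obtain for its gradient:

$$3\log\frac{3\phi - \beta_1}{3(1-\phi)}\cdot\nabla\phi - \log(3\phi - \beta_1)\cdot\nabla\beta_1 - \log\beta_2\cdot\nabla\beta_2 - \log(3\beta_3)\cdot\nabla\beta_3 - \nabla\beta_1 - \nabla\beta_2 - \nabla\beta_3 = -3\log U\cdot\nabla\phi + \log(V-1)\cdot\nabla\beta_1,$$

so, taking the $(x,p,j)$-coordinate:

$$\frac{\partial f_1}{\partial\mu_{x,p,j}} = \kappa_{x,p} \times \begin{cases} \log\dfrac{h_{x,p,j}}{\mu_{x,p,j}} - 1 + (x-2p)\log U + (p-j)\log(V-1) & \text{if } 0 \le j \le p-1; \\[2mm] \log\dfrac{h_{x,p,j}}{\mu_{x,p,j}} - 1 + (j-p)\log(V-1) & \text{if } p \le j \le x. \end{cases} \tag{25}$$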



With this knowledge, we can establish: 



Lemma 5.1 No feasible vector $\mu$ (i.e. $\mu \in [0,1]^N \cap U$ satisfying (24)) having at least one null
coordinate can be a local maximizer for the problem (23).

Proof. Choose $j_1$ and $j_2$ such that $\mu_{x,p,j_1} = 0$ and $\mu_{x,p,j_2} \ne 0$, and consider the real-valued
function $f^*$ defined on $]0, \mu_{x,p,j_2}[$ by $f^*(\xi) = f_1(\mu_\xi)$, where $\mu_\xi$ differs from $\mu$ only in the
$(x,p,j_1)$ and $(x,p,j_2)$ coordinates, the former being equal to $\xi$ and the latter to $\mu_{x,p,j_2} - \xi$.
Of course, for sufficiently small $\xi$, $\mu_\xi$ is still feasible, so it suffices to show that $0$ is not a
local maximum for $f^*$. But, using (25):

$$\frac{1}{\kappa_{x,p}}\frac{df^*}{d\xi} = \frac{1}{\kappa_{x,p}}\Big[\frac{\partial f_1}{\partial\mu_{x,p,j_1}}(\mu_\xi) - \frac{\partial f_1}{\partial\mu_{x,p,j_2}}(\mu_\xi)\Big] = \log\frac{h_{x,p,j_1}}{\xi} - \log\frac{h_{x,p,j_2}}{\mu_{x,p,j_2} - \xi} + R_{p,j_1,j_2}\log U(\mu_\xi) + S_{p,j_1,j_2}\log[V(\mu_\xi) - 1],$$

where $|R_{p,j_1,j_2}| \le x_{\max}$ and $|S_{p,j_1,j_2}| \le x_{\max}$. As $\xi \to 0^+$, the first term on the right tends
to $+\infty$, the second remains bounded, while, since $\mu_\xi$ is feasible,

$$\frac{9(1-\phi_{\max})\beta_{3\min}}{(3\phi_{\max} - \beta_{1\min})\beta_{2\max}} < U(\mu_\xi) < \frac{9(1-\phi_{\min})\beta_{3\max}}{(3\phi_{\min} - \beta_{1\max})\beta_{2\min}}$$

and

$$\frac{\beta_{2\min}^2}{3(3\phi_{\max} - \beta_{1\min})\beta_{3\max}} < V(\mu_\xi) - 1 < \frac{\beta_{2\max}^2}{3(3\phi_{\min} - \beta_{1\max})\beta_{3\min}},$$

so the third and fourth terms stay finite too. All in all, $df^*/d\xi$ is seen to tend to $+\infty$ as
$\xi \to 0^+$, so obviously (e.g., from the mean value formula) $0$ cannot be a local maximum for
$f^*$. ■



So, the set constraint in problem (23) may be replaced by $\mu \in O$, with $O = \,]0, +\infty[^N \cap U$
(an open subset of $\mathbb{R}^N$), allowing a study of local optima by traditional differential methods.
$f_1$ is readily seen to be bounded in $O$, since for $x > 2p$ all $h_{x,p,j}$'s are $\le 2^{x-p-1}$; thus, if some
number $M$ bounds from above the value of $f_1$ at every local maximizer in $O$, then $f_1(\mu) \le M$ for any feasible $\mu$.

Since all constraints are affine, we do not actually need any further constraint qualification
such as linear independence of the gradients, though this is clearly the case. The classical
method of Lagrange multipliers applies [34]: a necessary condition for optimality of $f_1$ at
some feasible $\mu^* \in O$ is stationarity in the sense that there exist real numbers $\Lambda_{x,p}$, $0 \le
2p \le x \le x_{\max}$, such that at $\mu^*$:

$$\nabla f_1 + \sum_{0 \le 2p \le x \le x_{\max}} \Lambda_{x,p}\,\nabla C_{x,p} = 0. \tag{26}$$



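Now take the $(x,p,j)$-coordinate of (26), using (25): since $\kappa_{x,p} \ne 0$ whenever $2p \le x$, we get,
for $0 \le j \le p-1$ and $p \le j \le x$ respectively:

$$\mu_{x,p,j} = h_{x,p,j}\exp\Big(\frac{\Lambda_{x,p}}{\kappa_{x,p}} - 1\Big)U^{x-2p}(V-1)^{p-j} \quad\text{and}\quad \mu_{x,p,j} = h_{x,p,j}\exp\Big(\frac{\Lambda_{x,p}}{\kappa_{x,p}} - 1\Big)(V-1)^{j-p}. \tag{27}$$

The Lagrange multipliers are determined by plugging this back into the constraints (24):

$$1 = \exp\Big(\frac{\Lambda_{x,p}}{\kappa_{x,p}} - 1\Big)\Big[U^{x-2p}\sum_{j=0}^{p-1}\binom{p}{j}(V-1)^{p-j} + \sum_{j=p}^{x}\binom{x-p}{j-p}(V-1)^{j-p}\Big] = \exp\Big(\frac{\Lambda_{x,p}}{\kappa_{x,p}} - 1\Big)\big[U^{x-2p}(V^p - 1) + V^{x-p}\big].$$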
Hence a necessary (and sufficient) condition for stationarity is that the $N(x_{\max})$ unknowns
$\mu_{x,p,j}$, $0 \le 2p \le x \le x_{\max}$, $0 \le j \le x$, satisfy the following system of $N(x_{\max})$
equations:

$$\mu_{x,p,j} = \begin{cases} \dbinom{p}{j}\dfrac{U^{x-2p}(V-1)^{p-j}}{U^{x-2p}(V^p - 1) + V^{x-p}} & \text{if } 0 \le j \le p-1; \\[3mm] \dbinom{x-p}{j-p}\dfrac{(V-1)^{j-p}}{U^{x-2p}(V^p - 1) + V^{x-p}} & \text{if } p \le j \le x \end{cases} \tag{28}$$

(where of course $U$ and $V$ are themselves fairly complicated functions of the $\mu$'s). Note also
that for any solution $\mu$, the quantity $\alpha_{x,p} = \sum_{j=0}^{p-1}\mu_{x,p,j}$ has the summation-free expression

$$\alpha_{x,p} = \frac{U^{x-2p}(V^p - 1)}{U^{x-2p}(V^p - 1) + V^{x-p}}$$

and systematically equals $0$ for $p = 0$. Further, under the same conditions the coefficient of
$\kappa_{x,p}/c$ in $\beta_1$ has the value

$$\frac{1}{U^{x-2p}(V^p - 1) + V^{x-p}}\Big[U^{x-2p}\sum_{h=1}^{p}h\binom{p}{h}(V-1)^h + \sum_{i=0}^{x-p}i\binom{x-p}{i}(V-1)^i\Big] = \frac{V-1}{V}\cdot\frac{pU^{x-2p}V^p + (x-p)V^{x-p}}{U^{x-2p}(V^p - 1) + V^{x-p}},$$

where again the summation in $j$ has disappeared. The system (28) may seem hopeless at
first sight. However, all unknowns can be extracted in terms of just two (affine) functions
of themselves, $\phi$ and $\beta_1$. This has the following consequence. Consider the system $S^*$ of
$N(x_{\max}) + 2$ equations in as many unknowns obtained by viewing $\phi$ and $\beta_1$ as two further
unknowns, and (16) and (14) as two additional equations. A solution of (28) immediately
gives one of $S^*$, and conversely. Solving (28) amounts to solving $S^*$ by trivially eliminating
$\phi$ and $\beta_1$. But the property just stated means that there is a better way to solve $S^*$, namely
eliminating the $\mu$'s, leaving just 2 equations in 2 unknowns. So, viewing now $U$ and $V$ as
functions of $\phi$ and $\beta_1$ only, we plug the r.h.s.'s of (28) into (16), (14) to obtain



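$$\lambda\phi = K - \sum_{1 \le 2p \le x \le x_{\max}} (x-2p)\,\kappa_{x,p}\Big[1 - \frac{1}{1 + \left(\frac{U}{V}\right)^{x-2p}(1 - V^{-p})}\Big] \tag{29}$$

$$\beta_1 c = \frac{V-1}{V}\sum_{0 \le 2p \le x \le x_{\max}} \kappa_{x,p}\,\frac{pU^{x-2p}V^p + (x-p)V^{x-p}}{U^{x-2p}(V^p - 1) + V^{x-p}} \tag{30}$$

Having solved this in $\phi$ and $\beta_1$, we plug them into (28) and obtain the $\mu$'s. While still highly
nonlinear, the system (29, 30) can, as we shall show, be rigorously analyzed. But what we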
certainly cannot do is to exploit convexity considerations as in [|n|| , |J: here the objective 
function is not concave. 
6 The equations: analysis and numerical resolution.



6.1 Preliminary transformations. 

In the sequel, $(\phi, \beta_1)$ will denote an arbitrary solution of (29, 30) or an equivalent system.
Before we proceed, it will be helpful to rearrange some of the already obtained expressions
in a more convenient form.



6.1.1 The expectation revisited. 



Assume that we have a solution $(\phi, \beta_1)$ to the system (29, 30) and that the corresponding
parameters $\mu$ do give rise to the (global) maximum appearing in (^). We show that our
bound on the expectation simplifies to a formula where the parameters $\mu$ only intervene via
the two quantities $\phi$ and $\beta_1$ (and $U$ and $V$ viewed as functions of these).

First we have, in view of (£3) and taking into account (]8|), JI^), (|S|), and fliTf): 



n 

0<2p<X<Xn 



n v „ 

j =Q \"-a;,PJ 



jjK—\<t> (y _ yPic 



n [u x ~ 2 p (vp - 1) + v x -p] Kx - p 

0<2p<x<x max 



Call the inverse of the r.h.s. gi ((f), f3 x ), plug it back into ([22]), and modify ([TT]) accordingly, 
using (2sn) D l Tl < exp (2eD/e) : 

2£*>2 P "*>p exp (2sD/e) 



^(X n , £ ,x max ,cY /n < (6n 3 )^3 c (A)' 



n 

0<2p<x<a; rl 



[p!(x-p)!«, 1K " ,P 



x 



G 2 (e,x max )g 2 (<f>, Pi) 



(^-(3^-^ [3(l-0)] 3(w) 



(31) 



This is essentially the estimate that will serve in our numerical evaluations. It is possible
further to transform it so that all exponents become fixed (i.e. independent of $\phi$ and $\beta_1$),
but this, although noteworthy, will not be used here.

Let us emphasize that the function of $\phi$ and $\beta_1$ on the right of (31) has little to do with
the objective function in (22) (a function of $\mu$, anyway). All we say is that it dominates
$E(X_{n,\varepsilon,x_{\max},c})^{1/n}$ for some pair(s) $(\phi, \beta_1)$ satisfying the system (29, 30), and our final bound
will be valid for any such solution, without having to assume or prove uniqueness. Although
it can be seen that (29, 30) actually characterizes stationary values of that function too, the
(in fact unique) solution is not a maximum but a saddle point.



6.1.2 A modified form of the second equation. 



The numerator of the fraction in the sum on the right-hand side of (30) can be written

$$pU^{x-2p}V^p + (x-p)V^{x-p} = (2p-x)U^{x-2p}(V^p - 1) + (x-p)\big[U^{x-2p}(V^p - 1) + V^{x-p}\big] + pU^{x-2p},$$

so that (30) also reads

$$\beta_1 c\,\frac{V}{V-1} = -\sum_{0 \le 2p \le x \le x_{\max}} H_{x,p}\,\alpha_{x,p} + \sum_{0 \le 2p \le x \le x_{\max}} (x-p)\,\kappa_{x,p} + \sum_{2 \le 2p \le x \le x_{\max}} \kappa_{x,p}\,\frac{pU^{x-2p}}{U^{x-2p}(V^p - 1) + V^{x-p}}.$$

The second sum is by definition $K$, while if (29) is verified, the first sum $\sum H_{x,p}\alpha_{x,p}$ is $K - \lambda\phi$. Thus the
system (29, 30) is equivalent to (29, 32), where (32) is as follows:

$$\beta_1 c\,\frac{V}{V-1} = \lambda\phi + \sum_{2 \le 2p \le x \le x_{\max}} \kappa_{x,p}\,\frac{pU^{x-2p}}{U^{x-2p}(V^p - 1) + V^{x-p}}. \tag{32}$$



6.1.3 The monotone behaviour of U and V in each variable separately. 

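Set [viewing $\beta_2$, $\beta_3$ as functions of $\phi$, $\beta_1$, cf. (17)]

$$\mathcal{D}_{\phi,\beta_1} = [\phi_{\min}, \phi_{\max}] \times [\beta_{1\min}, \beta_{1\max}] \cap \beta_2^{-1}([\beta_{2\min}, \beta_{2\max}]) \cap \beta_3^{-1}([\beta_{3\min}, \beta_{3\max}]).$$

Within our range of $c$, we can disregard pairs $(\phi, \beta_1) \notin \mathcal{D}_{\phi,\beta_1}$, so we will limit our study of (29, 32) to $\mathcal{D}_{\phi,\beta_1}$.

For fixed $\phi$ and variable $\beta_1$ within $\mathcal{D}_{\phi,\beta_1}$, then, $U$ increases (strictly) as the quotient of an
increasing numerator by a decreasing denominator.

For fixed $\beta_1$, $U$ increases (strictly) in $\phi$ since $\beta_{1\max} < 1$ implies

$$\frac{d\log U}{d\phi} = \frac{-1}{1-\phi} + \frac{3}{\beta_3} - \frac{3}{3\phi - \beta_1} + \frac{3}{\beta_2} = \frac{2\beta_1}{(1-\phi)\beta_2} + \frac{6(1-\beta_1)}{\beta_3(3\phi - \beta_1)} > 0.$$

(As an unconstrained linear combination of the $\mu_{x,p,j}$'s, $\beta_1$ does reach values $\ge 1$.)

For fixed $\phi$, $V - 1$ has a decreasing numerator, while the denominator, three times a product
of factors with a constant sum, increases until $\beta_1$ reaches $\beta_{1M}$ such that $\beta_3 = 3\phi - \beta_1$;
however, $\beta_{1M} = 1 > \beta_{1\max}$, so $V$ decreases on $\mathcal{D}_{\phi,\beta_1}$.

For fixed $\beta_1$, $V - 1$ also decreases owing to a decreasing numerator and increasing
denominator (in $V - 1 = \beta_2^2 / [3\beta_3(3\phi - \beta_1)]$, $\beta_2$ decreases in $\phi$ while $\beta_3$ and $3\phi - \beta_1$ increase).

To sum up: with either variable fixed, $U$ increases and $V$ decreases in the other variable.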



6.1.4 Bounds on U and V. 

We henceforth set $c = c_{\min} = c_{\max} = 4.506$. All the foregoing remains valid, but some
inequalities become tighter, starting with (13) and (19); in the latter we can now take $\beta_{1\min} =
\beta_{2\min} = 0.33018$; $\beta_{1\max} = \beta_{2\max} = 0.52891$; $\beta_{3\min} = 0.077639$; $\beta_{3\max} = 0.21782$; $\phi_{\min} =
0.525245$; and $\phi_{\max} = 0.619063$. (Also, e.g. now $R_3(\varepsilon, x_{\max})$ has a smaller value $< 1.104 \times 10^{-11}$.)
These limitations imply helpful ones on $U$ and $V$: $U \le U_{\max 1} = 2.69268$, $U \le
U_{\max 2}/(V - 1)$ where $U_{\max 2} = 0.687424$, $V \ge V_{\min 1} = 1.109255$, $U/V \le (U/V)_{\max 1} =
11.2022$; we need and prove better ones than the last two.

We solve the constrained minimization problem (with variables $\phi$ and $\beta_1$): minimize $V$,
subject to the 8 linear constraints written above. These define a convex polygonal domain
with 7 sides in the plane $(\phi, \beta_1)$. Due to the decreasing character of $V$ in each variable, the
minimum can obviously not be attained at an interior point, nor along any side other than
the two given by $\beta_3 = \beta_{3\max}$ and $\beta_2 = \beta_{2\min}$. $V$ is easily seen to decrease in $\beta_1$ along the
first and to increase in $\beta_1$ along the second, so the minimum is attained at their intersection,
and is found equal to:

$$V_{\min 2} = 1 + \frac{\beta_{2\min}^2}{3\beta_{3\max}(2\beta_{2\min} + 3\beta_{3\max})} = 1.126983.$$

We can maximize $(U/V)$ in a very similar way, since it increases in each variable separately.
Again, the maximum must be along one of the same two sides of the same polygon, as seen
using the form $U/V = 27(1-\phi)\beta_3^2 / [\beta_2(\beta_1 + 6\phi - 3)^2]$, and equals

$$\Big(\frac{U}{V}\Big)_{\max 2} = \frac{9\,[2(1-\beta_{3\max}) - \beta_{2\min}]\,\beta_{3\max}^2}{\beta_{2\min}(\beta_{2\min} + 3\beta_{3\max})^2} = 1.64966.$$
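The numerical bounds of this subsection can be recomputed from the interval endpoints. The sketch below assumes the expressions $U = 9(1-\phi)\beta_3/[(3\phi-\beta_1)\beta_2]$ and $V - 1 = \beta_2^2/[3\beta_3(3\phi-\beta_1)]$, hence $U(V-1) = 3(1-\phi)\beta_2/(3\phi-\beta_1)^2$, and uses the identities $3\phi - \beta_1 = 2\beta_2 + 3\beta_3$ and $\beta_1 + 6\phi - 3 = \beta_2 + 3\beta_3$, which follow from (17). Variable names are illustrative.

```python
# Interval endpoints quoted for c = 4.506.
PHI_MIN, PHI_MAX = 0.525245, 0.619063
B1_MIN, B1_MAX = 0.33018, 0.52891
B2_MIN, B2_MAX = 0.33018, 0.52891
B3_MIN, B3_MAX = 0.077639, 0.21782

# Extreme values of Y = 3*phi - beta_1 when phi and beta_1 vary independently.
y_min = 3 * PHI_MIN - B1_MAX
y_max = 3 * PHI_MAX - B1_MIN

# Crude bounds: every factor is replaced by its worst-case endpoint.
u_max1 = 9 * (1 - PHI_MIN) * B3_MAX / (y_min * B2_MIN)   # bound on U
u_max2 = 3 * (1 - PHI_MIN) * B2_MAX / y_min ** 2         # bound on U * (V - 1)
v_min1 = 1 + B2_MIN ** 2 / (3 * B3_MAX * y_max)          # bound on V

# Sharper bounds at the corner beta_2 = B2_MIN, beta_3 = B3_MAX of the polygon.
v_min2 = 1 + B2_MIN ** 2 / (3 * B3_MAX * (2 * B2_MIN + 3 * B3_MAX))
uv_max2 = (9 * (2 * (1 - B3_MAX) - B2_MIN) * B3_MAX ** 2
           / (B2_MIN * (B2_MIN + 3 * B3_MAX) ** 2))

for name, value in [("U_max1", u_max1), ("U_max2", u_max2), ("V_min1", v_min1),
                    ("V_min2", v_min2), ("(U/V)_max2", uv_max2)]:
    print(f"{name} = {value:.6f}")
```

The corner evaluation is what makes the second pair of bounds sharper: the crude bounds treat $\phi$ and $\beta_1$ as independent, whereas the corner respects the linear relations between $\beta_2$, $\beta_3$, $\phi$ and $\beta_1$.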



6.2 Outline. 

From now on, c is taken to be equal to 4.506. The remainder of the paper is devoted to 
showing that for this c, the product 2^ +eA ) n E (X nj£jXmaxjC ) tends to as n — > oo. Since 
the probability of satisfiability decreases in c, this will establish that the threshold is below 
4.506. 



While Figure 1 clearly suggests that the system (29, 32) has a unique solution, we present
a strictly rigorous analysis. It exploits special features of this system, leading to numerical
calculations which can be reliably and routinely performed to any desired precision.

A close study of the 2 equations, written as $Eq_1 = 0$ and $Eq_2 = 0$, shows that each defines
$\phi$ as a unique decreasing function of $\beta_1$; then a constructive numerical procedure is applied to
narrow down the location of any common root. Uniqueness is neither assumed nor proven,
though this could be done with a little more effort.

Actually, it suffices to establish the (strict) monotony of $Eq_1$ and $Eq_2$ in each variable
separately; the monotone behavior of the corresponding implicit functions follows. And it
turns out that it is easier to reason directly in terms of this separate monotony, and that in
this approach strictness is not used.

There is a slight restriction to the monotony of $Eq_1$, which does not affect the end result. A
precise statement follows.

Figure 1: The solutions $(\beta_1, \phi)$ of equations (29) (dashed
line) and (30) (solid line)

Proposition 6.1 (i) $Eq_2$ decreases strictly in each variable separately over the whole domain
of interest $\mathcal{D}_{\phi,\beta_1}$, and even over the wider set $[\phi_{\min}, \phi_{\max}] \times [\beta_{1\min}, \beta_{1\max}]$.
(ii) For any $\phi \in [\phi_{\min}, \phi_{\max}]$ (resp. any $\beta_1 \in [\beta_{1\min}, \beta_{1\max}]$), there exists $\beta_1^*(\phi) < 1$
(resp. $\phi^*(\beta_1) < 1$) such that $Eq_1(\phi, \cdot) < 0$ over the interval $[\beta_1^*(\phi), 1]$ (resp. such that
$Eq_1(\cdot, \beta_1) < 0$ over the interval $[\phi^*(\beta_1), 1]$) and that $Eq_1(\phi, \cdot)$ decreases strictly on $[0, \beta_1^*(\phi)]$
(resp. $Eq_1(\cdot, \beta_1)$ decreases strictly on $[0, \phi^*(\beta_1)]$). In particular, if $Eq_1(\phi, \beta_1) = 0$, then
$\beta_1^*(\phi) > \beta_1$ and $\phi^*(\beta_1) > \phi$.

The remainder of this section is devoted to proving Proposition 6.1. We start with the
equation that is monotone on the whole of $\mathcal{D}_{\phi,\beta_1}$.


6.3 The second equation, separate monotony. 

We use the modified form (32), and deal successively with fixed $\phi$, variable $\beta_1$; and the
reverse. First, some considerations which apply to both. (32) can be rewritten equivalently:



i + (W' 2r (i - *) 



= B g2 = -/3 1 c + A*(l-i)+ J2 P^ yL^n !- ' 

V J l<2p<x<x max 1 ; . 

(33) 

Call the denominator of the last fraction $D_{x,p}$. We differentiate in one variable, leaving the
other fixed. The derivatives are denoted simply by a prime because the context will always
make the meaning clear. The following equality holds in either case.



V - 1 



V (VP - 1) 



1 - 



-V'pV p+1 - (p+ 1) v p + 1 



V 2 



{Vp - 1) 



v-i (?) 



[/ \ 2p— 1 



viyp-i) di p 

It will be shown, in the relevant subsections, that 



(x — 2p) 



U^_V^ 

Tj~V 



1 



+ 



1 - 



pV 



Vp I Vp +1 



Lemma 6.2 Let X = 30 - 1, Y = 30 - /3 1} Z = Y - X = 1 - (3 X . We have 

U' V -V 



u v v(y-i) 

where R is a positive quantity such that for fixed 



R. 



while for fixed (3 X 

It follows that 

V - 1 



Y 



i?<S = |(y-l) + i. 



-V f - (p + i) yp + i 
1 (y P _ i) 2 

[7^-2p-l 

(x — 2p) R — p 



VpD 2 

x,p 



1 - 

V- 1 



Vp -I 



The two terms inside the curly brackets on the right will be called A x ^ p and B xp , respectively. 
We need to study the fraction in V that occurs in A x>p : 

p VP +1 - (p + 1) V p + 1 

Lemma 6.3 For nonzero p, the quantity ^ decreases in V > 1 (and 



tends to asV -»• 1 + 0/ 



(VP - 1) 



A standard exercise in derivatives and infinitesimals. 



6.3.1 Fixed $\phi$, variable $\beta_1$.

Proof of Lemma |0| (fixed 0). Note that (3 2 = -3X + 2Y, f3 3 = 2X - Y\ also, V'/V 



v-i 

Vf3 2 p 3 Y 



2X (Y - 3X) and U'/U = ^ (l - so 



_ F _ 

a v~ v(v -i) 



(V-l) 
That $R$ is positive results from the fact that $(U/V)$ is increasing and $V$ decreasing (in $\beta_1$).
We now show that $R$ can be expressed as a function of $V$ alone.

Proposition 6.4 For constant $\phi$,

$$R = \frac{3V^2 - 1 + 3V\sqrt{V(V-1)}}{3V + 1},$$

and this function is concave in $V > 1$.

Proof. Indeed, $Y/X = (R+1)/V$ and $V - 1 = \left(2\frac{Y}{X} - 3\right)^2 / \left[3\left(2 - \frac{Y}{X}\right)\frac{Y}{X}\right]$, whence a
second-degree relationship between $V$ and $R + 1$ which can be solved in $R + 1$:

$$R + 1 = \frac{3V}{3V+1}\Big[V + 1 + \omega\sqrt{V(V-1)}\Big],$$

where $\omega = \pm 1$. The coefficient of $\omega$ in $R - 1/2$ is larger than the $\omega$-free term, while from
the definition and (pj]), $R - 1/2$ is seen to equal $\frac{1}{6}\frac{Y}{X}\frac{\beta_2}{\beta_3}$, which is positive on $\mathcal{D}_{\phi,\beta_1}$.
Therefore, $\omega$ must be $+1$. As for concavity, $R''$ is found to have the sign of $11V^2 - 30V +
3 - 16(V-1)\sqrt{V(V-1)}$, or, in terms of $W = V - 1$:

$$11W^2 - 8W - 16 - 16W\sqrt{W(W+1)} < -5W^2 - 8W - 16 < 0.$$



Corollary 6.5 We have the following affine upper bound for $R$: whenever $V \ge V_{\min 1}$, and
irrespective of the constant value of $\phi$,

$$R \le a_1 V + b_1,$$

where $a_1 = 2.4427$ and $b_1 = -1.8194$.

Proof. The curve is below its tangent at any point, but since we need better and better
estimates as $V$ decreases, it is appropriate to choose the tangent at $V_{\min 2}$. This gives the
stated values of the coefficients. ■
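The coefficients of Corollary 6.5 can be checked numerically. The sketch below assumes the closed form $R(V) = [3V^2 - 1 + 3V\sqrt{V(V-1)}]/(3V+1)$ of Proposition 6.4 and estimates the tangent at $V_{\min 2} = 1.126983$ by a central finite difference; the step size `h` and the sampling grid are arbitrary choices made for illustration.

```python
import math

def R(v):
    """R as a function of V alone (fixed phi), for V > 1."""
    s = math.sqrt(v * (v - 1.0))
    return (3.0 * v * v - 1.0 + 3.0 * v * s) / (3.0 * v + 1.0)

def tangent(v0, h=1e-6):
    """Slope and intercept of the tangent to R at v0 (central difference)."""
    slope = (R(v0 + h) - R(v0 - h)) / (2.0 * h)
    return slope, R(v0) - slope * v0

V_MIN2 = 1.126983
A1, B1 = 2.4427, -1.8194  # coefficients quoted in Corollary 6.5

a, b = tangent(V_MIN2)
print(round(a, 4), round(b, 4))

# Since R is concave, the quoted line should lie above R on the sampled range.
grid = [1.001 + 0.001 * k for k in range(2000)]
print(all(A1 * v + B1 >= R(v) for v in grid))
```

A finite-difference slope is used only to avoid differentiating the closed form by hand; any agreement with the quoted coefficients is up to their rounding.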

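Since $-V'/V^2 > 0$, in order to prove that (33) decreases in $\beta_1$ it (amply) suffices to show that

$$\lambda\phi_{\min} > A + B, \quad\text{where}\quad A = \sum_{2 \le 2p \le x \le x_{\max}} p\,\kappa_{x,p}A_{x,p} \quad\text{and}\quad B = \sum_{3 \le 2p+1 \le x \le x_{\max}} p\,\kappa_{x,p}B_{x,p},$$

and $\lambda\phi_{\min} = 3 \times 4.506 \times 0.525245 > 7.1$.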



For A, we use Lemma 6.3: 



A < P k ^p 

2<2p<x< ■I- -max 



(V p -I) 2 
26 



1 - 



U\ X ~ 2 P 
max2 



the inequality < 5 which is valid for any pair (r, s) of positive reals : 



or less than 1.894. As regards B, we keep only the positive terms. Using Corollary |675| and 

) of posii 

aV- \b\ 



3<2p- 

But 



p (x — 2p) K XyP 



A (VP- 1)' 



*v-\b\ _ lbl w\ v ~ 1 1 



(yp-1) 1 1 V-l 1 + V + V 2 + ... + VP- 1 ' 

Since $a > |b|$, the homographic fraction on the r.h.s. decreases in $V > 1$, as does the last
fraction. So, our bound for $B$ can be made independent of $V \ge V_{\min 2}$ by evaluating it at
$V_{\min 2}$. This yields $B < 4.2269$, so the sum $A + B$ is less than 6.125. This closes the case of
fixed $\phi$, variable $\beta_1$.



6.3.2 Fixed $\beta_1$, variable $\phi$.



so 



Proof of Lemma |6.2| (fixed (3 X ). Here we use Y and Z. Observing that (3 2 = —Y + 2>Z 
fa = Y- 2Z, we find V'/V = Qy^yZ (3Z - 2Y) and U'/U = - J ^ + ^- + 
that 

v(v-i) 3W(v-i) 
v (1 - <j>) /w 

As before, R must be positive, and since the first term is negative, it suffices to show 
that the sum of the remaining two has the expression stated above for S. Remarking that 

V — 1 = 4r ^3^- — 3^Z/r) an d a l so ^ = 3Y^y_2 z) ' we °btain from the expression of V'/V : 



3YV(V- 1) 



1 



V 2 V 



2Z (2Y - 3Z) 



P2P3V 
and conclude using (|TTD . 
We now express S in terms of V alone: 



1 V3Z-V 
1 = - H 

2 Z 2 



-1 1 

Y~ + 3 (V - 2Z) 



Proposition 6.6 For $V > 1$ and $V \ne 4/3$,

$$S = \frac{1}{2} + \frac{3(V-1)}{3V-4}\Big[V - 2 + \sqrt{V(V-1)}\Big],$$

a concave function of $V > 1$ which does not actually have a singularity at $V = 4/3$.



Proof. Since y/Z = (S - 1/2) / (V - 1) and V = (3 - 2|) 2 / [3} (f - 2)] , we have a 
relationship between S\ = (S — |) and IV = (V — 1) which is quadratic in each separately, 
= (31V - 1) S 2 - 61V (IV - 1) Sx - 9W 2 , which we solve in 



Si 



31V 
31V- 1 



W -l+Uy/W(W + 1) 
However, since $Z$ is constant, $S$ cannot present a singularity at $W = 1/3$, which rules out
$\omega = -1$ in that vicinity, and by continuity for all $W$. (We can also, for $W \ne 1/3$, derive
straight contradictions from $\omega = -1$, as done in the fixed $\phi$ case.)
As to concavity, for $0 < W \ne 1/3$, we easily check that

d 2 S -3 (5W + 9) 

: T < °' 

WW (W + if + (11W 2 + 30W + 3) y/W(W + l) 

for W = 1/3, this second derivative extends by continuity, hence d 2 S/dV 2 also exists and is 
negative there. ■ 



dV 2 4(^ + 1) 



Corollary 6.7 We have the following affine upper bound for $R$: whenever $V \ge V_{\min 2}$, and
irrespective of the constant value of $\beta_1$,

$$R \le a'_1 V + b'_1,$$

where $a'_1 = 2.2377$ and $b'_1 = -1.7173$.

Proof. The coefficients are those of the tangent to $S$ at $V = V_{\min 2}$. Note that again, we
have $a'_1 > |b'_1|$. ■

Proving that (33) decreases in $\phi$ now boils down to showing that, with similar notations to
the above:

$$\lambda\phi_{\min} > -\lambda\,\frac{V(V-1)}{V'} + A + B, \quad\text{where}\quad A = \sum_{2 \le 2p \le x \le x_{\max}} p\,\kappa_{x,p}A_{x,p} \quad\text{and}\quad B = \sum_{3 \le 2p+1 \le x \le x_{\max}} p\,\kappa_{x,p}B_{x,p} \tag{34}$$

(of course, all derivatives are now understood to be in $\phi$ for constant $\beta_1$).

For $A$, we again have the bound 1.894. For $B$, the same estimate again applies, mutatis
mutandis, i.e. with $a'_1$ and $b'_1$ replacing $a$ and $b$ respectively. This gives

B< p{x-2p) K x , p ——p — — < 3.643. 

3<2p+l<x<x max \ V min2 ) 

And finally, from the expression of $V'/V$ and $V = (\beta_1 + 6\phi - 3)^2 / (3\beta_3 Y)$:

$$-\frac{V(V-1)}{V'} = \frac{(\beta_1 + 6\phi - 3)\,\beta_2}{18(1-\beta_1)} = \frac{(2Y - 3Z)(6Z - 2Y)}{36(1-\beta_1)},$$

which is maximized by equating the two factors on the right (with a constant sum), so that

$$-\lambda\,\frac{V(V-1)}{V'} \le \frac{\lambda(1-\beta_1)}{16} \le \frac{\lambda(1-\beta_{1\min})}{16} < 0.566.$$

Bringing all this together, irrespective of the constant value of $\beta_1$, for the right-hand side of
(34) we obtain the bound $1.894 + 3.643 + 0.566 = 6.103$, which is indeed less than 7.1.
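The closing arithmetic of Section 6.3 is easy to re-run. The sketch below assumes $\lambda = 3c$ (as in the product $3 \times 4.506 \times 0.525245$ used above) and takes the partial bounds 1.894, 4.2269 and 3.643 as given from the text rather than recomputing the sums $A$ and $B$, which would require the $\kappa_{x,p}$'s.

```python
C = 4.506
LAM = 3 * C          # assumption: lambda = 3c
PHI_MIN = 0.525245
B1_MIN = 0.33018

lhs = LAM * PHI_MIN                 # left-hand side lambda * phi_min
third = LAM * (1 - B1_MIN) / 16     # bound on -lambda * V (V - 1) / V'

# Bounds on A and B quoted in the text (not recomputed here).
A_BOUND = 1.894
B_BOUND_FIXED_PHI = 4.2269
B_BOUND_FIXED_BETA1 = 3.643

print(lhs > 7.1)
print(A_BOUND + B_BOUND_FIXED_PHI < 6.125)
print(A_BOUND + B_BOUND_FIXED_BETA1 + 0.566 < 7.1, third < 0.566)
```

Both cases leave a margin of roughly one unit below $\lambda\phi_{\min}$, which is why the text describes the sufficient condition as amply satisfied.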
6.4 The first equation, separate monotony.

Actually, as already stated, monotony does not always hold on the whole of $\mathcal{D}_{\phi,\beta_1}$ for the first
equation, at least not in $\beta_1$, nor is it strictly needed in order reliably to locate any solution of the
system. We shall only prove that, with one variable kept fixed:

Claim 1. $Eq_1$ decreases from a positive to a negative value, then stays negative;

i.e., the region where monotony may fail contains no solutions anyway. We write the fraction
in (29) as $1/(1 + E_{x,p})$.

6.4.1 Fixed $\phi$, variable $\beta_1$.

Lemma 6.8 $E_{x,p}$ increases in $\beta_1$ for $x \ge 2p + 2$, while for $x = 2p + 1$ it increases at least
for $0 \le \beta_1 \le \beta_1^*$, where $\beta_1^* = \sqrt{3} - 3(\sqrt{3} - 1)\phi$.

Proof. Note that $\phi_{\min} > 1/3$ and $\sqrt{3} - 3(\sqrt{3} - 1)\phi_{\max} > 0$, so that $\beta_1^*$ is indeed between
0 and 1. It is readily seen that $E_{4,1} = [27(1-\phi)\beta_3^2 / (\beta_1 + 6\phi - 3)^3]^2$, so that (with $X$,
$Y$, $Z$ as before) $dE_{4,1}/d\beta_1$ has the sign of $2/\beta_3 - 3/(\beta_1 + 6\phi - 3)$, or of $2(2Y - 3Z) -
3(Y - 2Z) = Y > 0$. Therefore, for $x \ge 2p + 2$, recalling that $U$ increases and $V$ decreases,
$E_{x,p}$, which for $p \ge 1$ equals $\left(\frac{U}{V}\right)^{x-2p-2}E_{4,1}\left(1 + \frac{1}{V} + \frac{1}{V^2} + \dots + \frac{1}{V^{p-1}}\right)$, clearly increases. For
$x = 2p + 1$, the derivative $dE_{3,1}/d\beta_1$ of $E_{3,1} = 27(1-\phi)\beta_2\beta_3^2 / (\beta_1 + 6\phi - 3)^4$ has the sign
of $-2/\beta_2 + 2/\beta_3 - 4/(\beta_1 + 6\phi - 3)$, or of $(Y^2/X^2 - 3)$. As $\beta_1$ increases from 0 to 1, $Y/X$
linearly decreases from $3\phi/(3\phi - 1)$ to 1, passing through $\sqrt{3}$ for $\beta_1 = \beta_1^*$. Therefore,
$E_{3,1}$ increases for $0 \le \beta_1 \le \beta_1^*$, then decreases; and we see that for $x = 2p + 1$, $E_{x,p} =
E_{3,1}\left(1 + \frac{1}{V} + \frac{1}{V^2} + \dots + \frac{1}{V^{p-1}}\right)$ of necessity increases in $\beta_1$ for $0 \le \beta_1 \le \beta_1^*$. ■

Now consider Eq\ deprived from the terms such that x = 2p + 1; we call this Eq\. Ob- 
viously Eq\ < Eql, and from Lemma |0| Eq\ decreases for < (3 1 < 1. Thus, to 
prove Claim 1 it suffices to show that Eql is negative at (3\. However, at (3\ we have 
U = 3 (1 - 0) / (30 - 1) , V = 2/^/3, so that U/V decreases in > 1/3. Hence at /3{, 
whatever the fixed value of G [0 min , 4> max \' 



= E qi = K - A0 - 



k XjP (x - 2p) 1 



1 



(35) 



2<2p<x<x 




1 



which is less than —0.157 < 0. 
6.4.2 Fixed $\beta_1$, variable $\phi$.



Lemma 6.9 E X)P increases in for x > 2p + 3 (provided ft > (3 lmin ), while for 2p + 1 < 
x < 2p + 2 it increases at least for < < 0* ; where 



1 

12 



15 



9 



2(3, 



^(1-^)^4^ + 4^ + 3 



2-/?i 1 2 - ft 

27ms wi/we decreases from 3/4 to 1/3 as ft increases from to 1. 



Proof. As before, i?^ increases whenever E x _ 2p +2,i does, because = E x _ 2p+ 2,i x 
(l + ^ + ^2 + ... + y^rr), and since C//V is increasing, £^ )P increases whenever E XQ)P does 
for some $x_0 < x$. Therefore all we have to show is that $E_{5,1}$ increases, and that for $0 \le \phi \le \phi^*$
so does $E_{3,1}$. Regarding the former, $E_{5,1} = 3^9 (1-\phi)^3 \beta_3^6 / [\beta_2 (\beta_1+6\phi-3)^8]$ has a
derivative $dE_{5,1}/d\phi$ with the same sign as $-3/(1-\phi) + 3/\beta_2 + 18/\beta_3 - 48/(\beta_1+6\phi-3)$, or as
$2X^2 - 7ZX + Z^2(7-Z)$. However, the discriminant $Z^2(8Z-7) = (1-\beta_1)^2(1-8\beta_1)$ of this
quadratic function of $X$ remains negative so long as $\beta_1 > \beta_{1\min}$, hence the required monotonicity.
Coming now to $dE_{3,1}/d\phi$, it has the sign of $-1/(1-\phi) - 3/\beta_2 + 6/\beta_3 - 24/(\beta_1+6\phi-3)$,
or of $2(1+Z)X^2 - Z(11+2Z)X + Z^2(11-Z)$. This equals zero for



$$X = Z\left[\frac{1}{2} + \frac{9 + \omega\sqrt{3}\,\sqrt{4Z^2 - 12Z + 11}}{4\,(Z+1)}\right] \qquad (36)$$

where $\omega = \pm 1$; however, recalling that $\phi$ cannot exceed $1 - 2\beta_1/3 = (1+2Z)/3$ lest $\beta_2$
should become negative, we see that $X - 2Z$ cannot be $> 0$, while if $\omega$ were $+1$, $X - 2Z$
would be the product of $Z/(Z+1)$ by a strictly decreasing function of $Z$ reaching 0 for
$Z = 1$, hence positive for $Z < 1$. Hence $\omega = -1$, and $\phi^*$ is then read from (36). The last assertion is straightforward.
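As a numerical sanity check on Lemma 6.9, the following sketch evaluates the closed form for $\phi^*$ and confirms that $X = 3\phi^* - 1$ is a root of the quadratic in $X$ above. It assumes the substitutions $X = 3\phi - 1$, $Z = 1 - \beta_1$, $\beta_2 = 3(1-\phi) - 2\beta_1$ and $\beta_3 = \beta_1 - 2 + 3\phi$, which agree with the bound $\phi \le (1+2Z)/3$ used in the proof; the function names are ours and purely illustrative.

```python
import math


def phi_star(beta1: float) -> float:
    """Closed form for phi* as a function of beta1 (omega = -1 branch)."""
    s = math.sqrt(3.0) * math.sqrt(4.0 * beta1 ** 2 + 4.0 * beta1 + 3.0)
    return (15.0 - 2.0 * beta1 - 9.0 / (2.0 - beta1)
            - (1.0 - beta1) * s / (2.0 - beta1)) / 12.0


def quadratic(X: float, Z: float) -> float:
    """2(1+Z)X^2 - Z(11+2Z)X + Z^2(11-Z), whose sign is that of dE_{3,1}/dphi."""
    return 2.0 * (1.0 + Z) * X ** 2 - Z * (11.0 + 2.0 * Z) * X + Z ** 2 * (11.0 - Z)


def dlog_E31_dphi(phi: float, beta1: float) -> float:
    """The rational expression whose sign governs dE_{3,1}/dphi.

    Assumption: beta2 = 3(1-phi) - 2*beta1 and beta3 = beta1 - 2 + 3*phi.
    """
    beta2 = 3.0 * (1.0 - phi) - 2.0 * beta1
    beta3 = beta1 - 2.0 + 3.0 * phi
    return (-1.0 / (1.0 - phi) - 3.0 / beta2 + 6.0 / beta3
            - 24.0 / (beta1 + 6.0 * phi - 3.0))


if __name__ == "__main__":
    for beta1 in (0.1, 0.3, 0.5, 0.7, 0.9):
        p = phi_star(beta1)
        print(beta1, p, quadratic(3.0 * p - 1.0, 1.0 - beta1), dlog_E31_dphi(p, beta1))
```

Both residuals printed by the loop should vanish up to rounding, and `phi_star` should run from 3/4 at $\beta_1 = 0$ down to 1/3 at $\beta_1 = 1$.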



Now consider $Eq_1$ deprived of the terms such that $2p+1 \le x \le 2p+2$; we call this
$Eq_1^{**}$. Obviously $Eq_1 \le Eq_1^{**}$, and from Lemma 6.9 $Eq_1^{**}$ decreases for $0 < \phi < 1$. Thus, to
prove Claim 1 it suffices to show that for any $\beta_1 \in [\beta_{1\min}, \beta_{1\max}]$, $Eq_1^{**}$ is negative at $\phi^*$.
Call the corresponding values of $U$ and $V$, as functions of $\beta_1$, $U^*$ and $V^*$ respectively. In a
moment, we will show that $U^*/V^*$ and $V^*$ behave like their unstarred, fixed-$\phi$ counterparts,
i.e., the first increases and the second decreases in $\beta_1$. Then, an upper bound for $Eq_1^{**}$ in
some interval $[\beta_{1L}, \beta_{1H}]$ is given by $M[\beta_{1L}, \beta_{1H}]$ defined as:

1 



K - A0* [/3 1H ] - k ^p ( x ~ 2 P) 

■■' ma x 



1 - 



1 + (£) [ft,]"- 2 " (1 - (^fc. 

Straightforward numerical calculation yields (still, of course, for x max = 56) 

M[.33, .39] < -.051, M[.39, .428] < -.051,
M[.428, .468] < -.062, M[.468, .529] < -.055,

establishing our final point. Now as promised:

Lemma 6.10 $V^*$ is a decreasing, $U^*/V^*$ an increasing function of $\beta_1 \in [0, 1]$.



Proof. Set $A = 9 - \sqrt{3}\sqrt{3 + 4\beta_1 + 4\beta_1^2}$, $B = 4 - 2\beta_1$, so that $3 < A < 6$, $2 < B < 4$, and

V * = 3(A-B) (A + 3BY T* = lil-PJQB-A) [3 (1 + 2Pl) + ( ° " ^ )] ' 

We take derivatives w.r.t. $\beta_1$, noting that $A' = -6(1+2\beta_1)/(9-A)$. First, $V^{*\prime}$ has the
sign of $AB' - A'B = \frac{2}{9-A}\left[30\beta_1 + 21 - 9\sqrt{3}\sqrt{3 + 4\beta_1 + 4\beta_1^2}\right] < 0$. As for $U^*/V^*$,
the factor in square brackets on the right is increasing, so it suffices to show that each
of $(A-B)/(1-\beta_1)$ and $(A-B)/(3B-A)$ increases too. The derivative of the latter
has the sign of $A'B - AB'$, positive as we have just seen; the derivative of the former is
$[A'(1-\beta_1) + A - 2]/(1-\beta_1)^2$, so has the sign of
$-6(1+2\beta_1)(1-\beta_1) + (A-2)(9-A) = 7\sqrt{3}\sqrt{3 + 4\beta_1 + 4\beta_1^2} - 15 - 18\beta_1 > 0$.
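The two sign computations in this proof are easy to check numerically. The sketch below assumes $B = 4 - 2\beta_1$ (the reading under which $2 < B < 4$ holds on $(0,1)$ and the stated derivative formulas come out), and compares the closed forms with direct evaluation; all function names are ours.

```python
import math


def s(b: float) -> float:
    """sqrt(3) * sqrt(3 + 4 b + 4 b^2), so that A = 9 - s."""
    return math.sqrt(3.0) * math.sqrt(3.0 + 4.0 * b + 4.0 * b * b)


def A(b: float) -> float:
    return 9.0 - s(b)


def B(b: float) -> float:
    # Assumption: B = 4 - 2*beta1, hence B' = -2.
    return 4.0 - 2.0 * b


def A_prime(b: float) -> float:
    """Closed form A' = -6(1 + 2 b) / (9 - A)."""
    return -6.0 * (1.0 + 2.0 * b) / (9.0 - A(b))


def ab_direct(b: float) -> float:
    """A B' - A' B evaluated directly."""
    return A(b) * (-2.0) - A_prime(b) * B(b)


def ab_closed(b: float) -> float:
    """Closed form 2/(9-A) * (30 b + 21 - 9 sqrt(3) sqrt(3 + 4b + 4b^2))."""
    return 2.0 / (9.0 - A(b)) * (30.0 * b + 21.0 - 9.0 * s(b))


def former_direct(b: float) -> float:
    """-6(1 + 2b)(1 - b) + (A - 2)(9 - A)."""
    return -6.0 * (1.0 + 2.0 * b) * (1.0 - b) + (A(b) - 2.0) * (9.0 - A(b))


def former_closed(b: float) -> float:
    """7 sqrt(3) sqrt(3 + 4b + 4b^2) - 15 - 18 b."""
    return 7.0 * s(b) - 15.0 - 18.0 * b


if __name__ == "__main__":
    for b in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(b, ab_direct(b), ab_closed(b), former_direct(b), former_closed(b))
```

On each row the second and third columns should agree and be negative, and the last two should agree and be positive.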



This ends the proof of Proposition 6.1. Although it can be made slightly more precise
and then used to show the existence, uniqueness, and globally decreasing character of the
implicit functions defined by $Eq_1$ and $Eq_2$, we will not do so, since we do not need to. We
could actually remove the word 'strict' and still proceed to the final subsection.



6.5 Root localization. 



Using Proposition 6.1, we shall show that any (feasible) common root $(\phi^*, \beta_1^*)$ to $Eq_1$ and
$Eq_2$ must lie in a small rectangle $\mathcal{R} = [\phi^-, \phi^+] \times [\beta_1^-, \beta_1^+]$, and that on the whole of $\mathcal{R}$, our
bound (31) for the expectation is strictly less than 1. Since we already know the (global)
maximizer for (32) to exist and to necessarily give rise to such a common root for which,
besides, (31) will be valid, this will show our chosen value of $c$, 4.506, to be above the
threshold without even the need for a direct proof of either existence or uniqueness of such
a $(\phi^*, \beta_1^*)$.

We determine $\mathcal{R}$ explicitly, together with four numerical sequences which witness to the
fact that no solution can lie outside $\mathcal{R}$, owing to Corollary 6.13 below. This amounts, in
a rigorous presentation, to a very elementary numerical procedure which starts at a corner
of the rectangle $[\phi_{\min}, \phi_{\max}] \times [\beta_{1\min}, \beta_{1\max}]$ containing $\mathcal{D}_{x_{\max}}$, and spirals its way towards a
solution.



From Proposition 3.1 follows 



Lemma 6.11 Let $Eq$ be either $Eq_1$ or $Eq_2$, $A = (\phi_A, \beta_{1A})$, $B = (\phi_B, \beta_{1B})$. (i) If $Eq(A) > 0$
and $B \le A$ (meaning $\phi_B \le \phi_A$ and $\beta_{1B} \le \beta_{1A}$), then $Eq(B) > 0$; (ii) If $Eq(A) < 0$ and
$A \le B$, then $Eq(B) < 0$.

Proof. Do it in two steps, changing one coordinate at a time; for (i), use monotonicity; for
(ii), use monotonicity if $Eq = Eq_2$, and if $Eq = Eq_1$ use the fact that if $Eq_1$ is negative, then
it stays so if a single coordinate is increased. ■

This in turn implies

Proposition 6.12 Let $A = (\phi_0, \beta_{1A})$ and $B = (\phi_0, \beta_{1B})$ have the same $\phi$-coordinate, while
$C = (\phi_C, \beta_{1,0})$ and $D = (\phi_D, \beta_{1,0})$ have the same $\beta_1$-coordinate.

(i) If $\beta_{1B} < \beta_{1A}$, $Eq_1(A) > 0$ and $Eq_2(B) < 0$, then the closed rectangle $[\phi_{\min}, \phi_{\max}] \times [\beta_{1B}, \beta_{1A}]$ contains no common root to $Eq_1$ and $Eq_2$.

(ii) If $\beta_{1A} < \beta_{1B}$, $Eq_1(A) < 0$ and $Eq_2(B) > 0$, then the closed rectangle $[\phi_{\min}, \phi_{\max}] \times [\beta_{1A}, \beta_{1B}]$ contains no common root to $Eq_1$ and $Eq_2$.

(iii) If $\phi_C < \phi_D$, $Eq_2(C) < 0$ and $Eq_1(D) > 0$, then the closed rectangle $[\phi_C, \phi_D] \times [\beta_{1\min}, \beta_{1\max}]$ contains no common root to $Eq_1$ and $Eq_2$.

(iv) If $\phi_D < \phi_C$, $Eq_2(C) > 0$ and $Eq_1(D) < 0$, then the closed rectangle $[\phi_D, \phi_C] \times [\beta_{1\min}, \beta_{1\max}]$ contains no common root to $Eq_1$ and $Eq_2$.



Proof. (i) Let $P = (\phi, \beta_1)$ be an arbitrary point of the rectangle. We show that if $\phi \le \phi_0$
then $P$ is not a solution of $Eq_1$, while if $\phi \ge \phi_0$, $P$ fails to satisfy $Eq_2$. Indeed, in the former
case we have $P \le A$, so we use (i) of Lemma 6.11 with $Eq = Eq_1$; in the latter, $B \le P$, so
we apply (ii) of the same lemma with $Eq = Eq_2$.

(ii), (iii) and (iv) Very similar (or actually the same up to notation). ■

As an obvious corollary, we obtain the final link leading to the main result of this paper: 

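Corollary 6.13 Let four finite sequences $\phi_0^- < \phi_1^- < \dots < \phi_K^-$, $\beta_{1,0}^+ > \beta_{1,1}^+ > \dots > \beta_{1,K}^+$,
$\phi_0^+ > \phi_1^+ > \dots > \phi_L^+$, $\beta_{1,0}^- < \beta_{1,1}^- < \dots < \beta_{1,L}^-$, with $\phi_0^- = \phi_{\min}$, $\beta_{1,0}^+ = \beta_{1\max}$, $\phi_0^+ = \phi_{\max}$ and
$\beta_{1,0}^- = \beta_{1\min}$, verify:

$Eq_1(\phi_i^-, \beta_{1,i}^+) > 0$, $0 \le i \le K$, $\quad Eq_2(\phi_i^-, \beta_{1,i+1}^+) < 0$, $0 \le i \le K-1$,

$Eq_1(\phi_j^+, \beta_{1,j}^-) < 0$, $0 \le j \le L$, $\quad Eq_2(\phi_j^+, \beta_{1,j+1}^-) > 0$, $0 \le j \le L-1$.

Then no feasible common solution to $Eq_1$ and $Eq_2$ can lie outside the rectangle $]\phi_K^-, \phi_L^+[ \, \times \, ]\beta_{1,L}^-, \beta_{1,K}^+[$.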



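Proof. Successive applications of Proposition 6.12 (i) with $A = (\phi_i^-, \beta_{1,i}^+)$ and
$B = (\phi_i^-, \beta_{1,i+1}^+)$ exclude the band $[\phi_{\min}, \phi_{\max}] \times [\beta_{1,K}^+, \beta_{1\max}]$. We similarly exclude
$[\phi_{\min}, \phi_{\max}] \times [\beta_{1\min}, \beta_{1,L}^-]$, $[\phi_{\min}, \phi_K^-] \times [\beta_{1\min}, \beta_{1\max}]$, and $[\phi_L^+, \phi_{\max}] \times [\beta_{1\min}, \beta_{1\max}]$. ■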
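The staircase argument behind Corollary 6.13 can be sketched in code. The fragment below is only an illustration under stated assumptions: `eq1` and `eq2` stand for any two functions of $(\phi, \beta_1)$ that decrease in each argument, the sequences are built greedily by bisection (our choice; the text does not say how its sequences were found), and every function name is hypothetical. The toy functions in the demo are not the $Eq_1$ and $Eq_2$ of this paper.

```python
from typing import Callable, List, Tuple

Eq = Callable[[float, float], float]  # Eq(phi, beta1), decreasing in each argument


def bracket(f: Callable[[float], float], lo: float, hi: float,
            tol: float = 1e-12) -> Tuple[float, float]:
    """For decreasing f with f(lo) > 0 > f(hi), bisect to (a, b) with
    f(a) > 0 > f(b); b - a <= tol unless an exact zero of f is hit."""
    if not (f(lo) > 0 > f(hi)):
        raise ValueError("no strict sign change on [lo, hi]")
    a, b = lo, hi
    while b - a > tol:
        mid = 0.5 * (a + b)
        v = f(mid)
        if v > 0:
            a = mid
        elif v < 0:
            b = mid
        else:
            break  # exact zero: keep the current strict bracket
    return a, b


def upper_left_stairs(eq1: Eq, eq2: Eq, phi_min: float, phi_max: float,
                      b_min: float, b_max: float, steps: int):
    """Sequences (phi_i^-) increasing and (beta_{1,i}^+) decreasing."""
    phis, betas = [phi_min], [b_max]
    for _ in range(steps):
        phi, beta = phis[-1], betas[-1]
        # Lower beta1 at fixed phi, stopping where Eq2 is still just negative.
        _, new_beta = bracket(lambda b: eq2(phi, b), b_min, beta)
        # Raise phi at fixed beta1, stopping where Eq1 is still just positive.
        new_phi, _ = bracket(lambda p: eq1(p, new_beta), phi, phi_max)
        phis.append(new_phi)
        betas.append(new_beta)
    return phis, betas


def lower_right_stairs(eq1: Eq, eq2: Eq, phi_min: float, phi_max: float,
                       b_min: float, b_max: float, steps: int):
    """Sequences (phi_j^+) decreasing and (beta_{1,j}^-) increasing."""
    phis, betas = [phi_max], [b_min]
    for _ in range(steps):
        phi, beta = phis[-1], betas[-1]
        # Raise beta1 at fixed phi, stopping where Eq2 is still just positive.
        new_beta, _ = bracket(lambda b: eq2(phi, b), beta, b_max)
        # Lower phi at fixed beta1, stopping where Eq1 is still just negative.
        _, new_phi = bracket(lambda p: eq1(p, new_beta), phi_min, phi)
        phis.append(new_phi)
        betas.append(new_beta)
    return phis, betas


def satisfies_corollary(eq1: Eq, eq2: Eq, phi_m: List[float], beta_p: List[float],
                        phi_p: List[float], beta_m: List[float]) -> bool:
    """Check the ordering and sign hypotheses of Corollary 6.13."""
    K, L = len(phi_m) - 1, len(phi_p) - 1
    ok = all(phi_m[i] < phi_m[i + 1] and beta_p[i] > beta_p[i + 1] for i in range(K))
    ok = ok and all(phi_p[j] > phi_p[j + 1] and beta_m[j] < beta_m[j + 1] for j in range(L))
    ok = ok and all(eq1(phi_m[i], beta_p[i]) > 0 for i in range(K + 1))
    ok = ok and all(eq2(phi_m[i], beta_p[i + 1]) < 0 for i in range(K))
    ok = ok and all(eq1(phi_p[j], beta_m[j]) < 0 for j in range(L + 1))
    ok = ok and all(eq2(phi_p[j], beta_m[j + 1]) > 0 for j in range(L))
    return ok


if __name__ == "__main__":
    # Toy stand-ins with common root (0.3, 0.4); NOT the paper's equations.
    toy1 = lambda p, b: 1.0 - 2.0 * p - b
    toy2 = lambda p, b: 0.7 - p - b
    pm, bp = upper_left_stairs(toy1, toy2, 0.0, 0.6, 0.0, 0.9, 20)
    pp, bm = lower_right_stairs(toy1, toy2, 0.0, 0.6, 0.0, 0.9, 20)
    print(satisfies_corollary(toy1, toy2, pm, bp, pp, bm))
    print((pm[-1], pp[-1]), (bm[-1], bp[-1]))
```

When the hypotheses check out, the final rectangle is read off from the last elements of the four lists, exactly as in the statement of the corollary.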

All that remains to do is explicitly to give our four sequences as above, and to check that
the hypotheses of the corollary obtain and that the bound (31) is uniformly strictly less than
one on the whole rectangle $\mathcal{R} = [\phi_K^-, \phi_L^+] \times [\beta_{1,L}^-, \beta_{1,K}^+]$ (the bound being independent of
sufficiently large $n$).



We first compute the product G± (e, x max ) G 2 (e, x max ) exp (2eD/e) appearing in fl3~T|), to be 
less than 1 + 10~ 7 \ 
We then determine sequences $\phi_i^-$ and $\beta_{1,i}^+$ as above, with $K = 62$, and sequences $\phi_j^+$ and $\beta_{1,j}^-$
with $L = 52$ satisfying the requirements of Corollary 6.13, and such that $\mathcal{R} = [\phi_K^-, \phi_L^+] \times [\beta_{1,L}^-, \beta_{1,K}^+]$
$= [0.56383217, 0.56383249] \times [0.44651403, 0.44651478]$. Taking into account the
monotony properties of Z7, V, and of the functions x i— > x 2 , x i— > x y and i i-> j/ 1 , an upper 
bound for the right-hand side of (|3~I"D throughout 1Z is seen to be the product of (6ra 3 ) 1 ^"' by 



(1 + 10~ 7 ) x 3 C 



6e 



2^ 



J! [p!(x-p)!^.J^ 

• ' III a r 



X 



n 



V — 1 1 0<2p<ic<:r„ 



(30i - /?r,J 



[3(1-^)] 



3 1- 



^ 2 3/3. 



where C/ = U (faft^), V = V (fa,?^), U = U(4>^^ L ), V = V (4>t, (3+ K ) , K = 
3 (1 - <f>t) -2Pi tK , and (3 3 = /3^ L -2 + 30 x . The product of this bound by 2^ +eA < 1 + 10~ 14 
is computed to be < 0.9999885. So, for c = 4.506, x max = 56, and e = 10~ 15 , the product 
2 ( P +sA)n E (X n)S)Xmax , c ) is less than 6n 3 0.9999885™, and we conclude using Proposition W?A 
and the decreasing character of Pr n>c (SAT) as a function of c. 
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Appendix A 

Proof of Lemma 1.1 (As stated, what we prove is actually stronger.) In the ordered-clauses
model, if the number of occurrences of variable $i$ is $K_i$, the random vector $(K_i)_{1 \le i \le n}$
follows a multinomial law of parameters $\lambda n$, $p_1 = \dots = p_n = 1/n$, where $\lambda = 3c$. Also, the
number of positive occurrences of variable $i$ is modelled by the r.v. $S_i = \sum_{j=1}^{K_i} X_{i,j}$, with $X_{i,j}$
i.i.d. $B(1, 1/2)$ coin tosses, constructed to be independent of the multinomial vector (cf. Th.
2.19 of |H). For $x \in \mathbb{N}$, $S_{i,x} = \sum_{j=1}^{x} X_{i,j}$ has a binomial $B(x, 1/2)$ distribution:

$$\Pr(S_i = p \mid K_i = x) = \Pr(S_{i,x} = p \mid K_i = x) = \Pr(S_{i,x} = p) = \frac{1}{2^x}\binom{x}{p}. \qquad (37)$$

The number of variables having $x$ occurrences is $N_x = \sum_{i=1}^{n} 1_{\{K_i = x\}}$, while those having $x$
occurrences out of which $p$ are positive is $W_{x,p} = \sum_{i=1}^{n} 1_{\{K_i = x,\, S_i = p\}}$.

We use the large deviation property of binomial r.v.s in the following form:

Define $h(q,t) = (q+t)\,\mathrm{Log}(1 + t/q) + (1-q-t)\,\mathrm{Log}\big(1 - t/(1-q)\big)$ if $t < 1-q$, $+\infty$
otherwise; and $c(q,t) = \min\{h(q,t), h(1-q,t)\}$. Let $Y$ be the sum of $n$ independent
indicator variables with common expectation $q$; then for any $\varepsilon > 0$,

$$\Pr\left(\left|\frac{Y}{n} - q\right| > \varepsilon\right) \le 2e^{-c(q,\varepsilon)\,n}. \qquad (38)$$

We also use the fact that a Poisson r.v. with integral mean $\mu$ cannot have too small a
probability of equalling $\mu$: there is, as can be seen from a variant of Stirling's formula, an
absolute constant $C_0 > 0$ such that if $Z$ is Poisson with integer parameter $\mu \ge 1$, then

$$\Pr(Z = \mu) \ge \frac{C_0}{\sqrt{\mu}}. \qquad (39)$$

Recall that the Poisson probability mass function, $e^{-\lambda}\lambda^x/x!$, is denoted by $p(x, \lambda)$.
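For a feel of the constant in (39), the sketch below evaluates $\sqrt{\mu}\,\Pr(Z = \mu)$ for integer $\mu$, working with logarithms to avoid overflow. By Stirling's formula this quantity tends to $1/\sqrt{2\pi} \approx 0.3989$; over the values tested its smallest value is attained at $\mu = 1$, where it equals $1/e$. The helper names are ours.

```python
import math


def poisson_pmf_at_mean(mu: int) -> float:
    """Pr(Z = mu) for Z ~ Poisson(mu), computed via logarithms."""
    return math.exp(-mu + mu * math.log(mu) - math.lgamma(mu + 1))


def scaled(mu: int) -> float:
    """sqrt(mu) * Pr(Z = mu), the quantity bounded below in (39)."""
    return math.sqrt(mu) * poisson_pmf_at_mean(mu)


if __name__ == "__main__":
    for mu in (1, 2, 3, 10, 100, 10 ** 4, 10 ** 6):
        print(mu, scaled(mu))
    print("limit 1/sqrt(2*pi) =", 1.0 / math.sqrt(2.0 * math.pi))
```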



Now consider a Poisson r.v. $M$ with mean $\lambda n$, and construct (e.g., Lemma 5.9 in p6) a
random vector $(L_i)$, $1 \le i \le n$, that is multinomially distributed conditionally on $M$, i.e.:

$$\Pr\big((L_i) = (l_i) \mid M = m'\big) = \binom{m'}{l_1, \dots, l_n}\, n^{-m'}.$$

Probabilities and expectations in the Poissonized model will be subscripted with a $\lambda$. In
particular,

$$\Pr\nolimits_\lambda\big((L_i) = (l_i) \mid M = \lambda n\big) = \Pr\big((K_i) = (l_i)\big). \qquad (40)$$
The law of the vector $(L_i)$ is obtained by deconditioning, giving a sum with just one nonvanishing term:

$$\Pr\nolimits_\lambda\big((L_i) = (l_i)\big) = \sum_{m'} \Pr\nolimits_\lambda\big((L_i) = (l_i) \mid M = m'\big)\,\Pr\nolimits_\lambda(M = m') = \prod_{i=1}^{n} p(l_i, \lambda).$$



So (summing w.r.t. all coordinates but one), the $L_i$ are independent, each being Poisson with
mean $\lambda$. We let $X'_{i,j}$ be i.i.d. coin tosses in the Poissonized model which are also (completely)
independent of the vector $(L_i, M)$, so that in this model, the 'number of positive occurrences of
variable $i$' is $S'_i = \sum_{j=1}^{L_i} X'_{i,j}$. We also consider, for $x \in \mathbb{N}$, $S'_{i,x} = \sum_{j=1}^{x} X'_{i,j}$, which has a
binomial $B(x, 1/2)$ distribution; on account of our independence hypotheses

$$\Pr\nolimits_\lambda(S'_i = p \mid L_i = x) = \Pr\nolimits_\lambda(S'_{i,x} = p \mid L_i = x) = \Pr\nolimits_\lambda(S'_{i,x} = p) = \frac{1}{2^x}\binom{x}{p}. \qquad (41)$$
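The deconditioning step, in which the sum over $m'$ has a single nonvanishing term and collapses to a product of Poisson masses, can be checked numerically on small vectors. The sketch below is illustrative only; the function names are ours, and $\lambda$ and the vector $(l_i)$ are arbitrary test values.

```python
import math
from typing import Sequence


def multinomial_coeff(ls: Sequence[int]) -> int:
    """m! / (l_1! ... l_n!) with m = l_1 + ... + l_n."""
    c = math.factorial(sum(ls))
    for l in ls:
        c //= math.factorial(l)  # each division is exact
    return c


def poisson_pmf(x: int, lam: float) -> float:
    """p(x, lambda) = exp(-lambda) lambda^x / x!."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def joint_via_deconditioning(ls: Sequence[int], lam: float) -> float:
    """Multinomial law given M = m, times Pr(M = m) for M ~ Poisson(lam * n);
    only m = sum(ls) contributes to the sum over m'."""
    n, m = len(ls), sum(ls)
    return multinomial_coeff(ls) * float(n) ** (-m) * poisson_pmf(m, lam * n)


def joint_as_product(ls: Sequence[int], lam: float) -> float:
    """Product of independent Poisson(lam) masses."""
    return math.prod(poisson_pmf(l, lam) for l in ls)


if __name__ == "__main__":
    ls, lam = (2, 0, 3, 1), 1.5
    print(joint_via_deconditioning(ls, lam), joint_as_product(ls, lam))
```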



In terms of the indicators $U_i(x) = 1_{\{L_i = x\}}$, $V_i(p) = 1_{\{S'_i = p\}}$, and $W'_{i,x,p} = 1_{\{L_i = x,\, S'_i = p\}} = U_i(x)\,V_i(p)$, the 'number of variables with $x$ occurrences, $p$ among them positive' is $W'_{x,p} = \sum_{i=1}^{n} W'_{i,x,p}$. We will need the following lemma, to be proved later.

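Lemma A.1 In the setup just defined, the law of $W'_{x,p}$, conditional on $M = \lambda n$, is the same
as the law of $W_{x,p}$.

Clearly then, by (41):

$$\mathrm{E}_\lambda W'_{i,x,p} = \Pr\nolimits_\lambda(S'_i = p \mid L_i = x)\,\Pr\nolimits_\lambda(L_i = x) = \frac{1}{2^x}\binom{x}{p}\,p(x, \lambda) = \kappa_{x,p}.$$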
By (38) we have:

$$\Pr\nolimits_\lambda\left(\left|\frac{W'_{x,p}}{n} - \kappa_{x,p}\right| > \varepsilon\right) \le 2e^{-c(\kappa_{x,p},\,\varepsilon)\,n}.$$



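We now depoissonize, i.e. we decompose w.r.t. the values of $M$:

$$2e^{-c(\kappa_{x,p},\,\varepsilon)\,n} \ge \sum_{m'} \Pr\nolimits_\lambda\left(\left|\frac{W'_{x,p}}{n} - \kappa_{x,p}\right| > \varepsilon \,\Big|\, M = m'\right)\Pr\nolimits_\lambda(M = m')$$
$$\ge \Pr\nolimits_\lambda\left(\left|\frac{W'_{x,p}}{n} - \kappa_{x,p}\right| > \varepsilon \,\Big|\, M = \lambda n\right)\Pr\nolimits_\lambda(M = \lambda n).$$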


By Lemma A.1 and (39), this implies

$$\Pr\left(\left|\frac{W_{x,p}}{n} - \kappa_{x,p}\right| > \varepsilon\right) \le \frac{2}{C_0}\sqrt{\lambda n}\; e^{-c(\kappa_{x,p},\,\varepsilon)\,n},$$

which is stronger than Lemma 1.1.

Proof of Lemma A.1. The law of a random vector determines that of the sum of its

components, and the same holds for the conditional laws relative to some event (e.g., [33],
p. 218, end of § 14).

So, it is sufficient to show that the law of the random vector $(W'_{i,x,p})_{1 \le i \le n}$, conditional on
$M = \lambda n$, is the same as the law of $(W_{i,x,p})_{1 \le i \le n}$, where $W_{i,x,p} = 1_{\{K_i = x,\, S_i = p\}} = 1_{\{K_i = x\}}\,1_{\{S_i = p\}}$;
and this, in turn, will follow if we show that the conditional law of the $2n$-dimensional random
vector $(L_i, S'_i)_{1 \le i \le n}$ is the same as the law of $(K_i, S_i)_{1 \le i \le n}$. Now, for any $\mathbb{N}^n$-valued vectors
$(l_i)_{1 \le i \le n}$ and $(p_i)_{1 \le i \le n}$,

$$\Pr\nolimits_\lambda\big((L_i = l_i,\, S'_i = p_i) \mid M = \lambda n\big) = \Pr\nolimits_\lambda\big((L_i = l_i,\, S'_{i,l_i} = p_i) \mid M = \lambda n\big).$$

Here the event $A = \bigcap_{1 \le i \le n} \{S'_{i,l_i} = p_i\}$ is independent of the conjunction $B \cap C$, with
$B = \bigcap_{1 \le i \le n} \{L_i = l_i\}$ and $C = \{M = \lambda n\}$, so $\Pr_\lambda(A \mid B \cap C) = \Pr_\lambda(A)$. Applying the
generally-valid

$$P(A \cap B \mid C) = P(A \mid B \cap C)\,P(B \mid C),$$

and using (40), we see that

$$\Pr\nolimits_\lambda\big((L_i = l_i,\, S'_i = p_i) \mid M = \lambda n\big) = \Pr\nolimits_\lambda\big((S'_{i,l_i} = p_i)\big)\,\Pr\big((K_i = l_i)\big). \qquad (42)$$

Although the $K_i$ are not independent, our setup does ensure that the events $\bigcap_{1 \le i \le n} \{S_{i,l_i} = p_i\}$
and $\bigcap_{1 \le i \le n} \{K_i = l_i\}$ are independent, so

$$\Pr\big((K_i = l_i,\, S_{i,l_i} = p_i)\big) = \Pr\big((S_{i,l_i} = p_i)\big)\,\Pr\big((K_i = l_i)\big). \qquad (43)$$

But, by (41) and (37), the first factors on the right in (42) and (43) are both equal to
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